We say that a first order formula $\Phi$ distinguishes a structure M over a vocabulary L from another structure M' over the same vocabulary if $\Phi$ is true on M but false on M'. A formula $\Phi$ defines an L-structure M if $\Phi$ distinguishes M from any other non-isomorphic L-structure M'. A formula $\Phi$ identifies an n-element L-structure M if $\Phi$ distinguishes M from any other non-isomorphic n-element L-structure M'.

We prove that every n-element structure M is identifiable by a formula with quantifier rank less than $(1 - \frac{1}{2k})n + k^2 - k + 2$ and at most one quantifier alternation, where k is the maximum relation arity of M. Moreover, if the automorphism group of M contains no transposition of two elements, the same result holds for definability rather than identification.

The Bernays-Schönfinkel class consists of prenex formulas in which the existential quantifiers all precede the universal quantifiers. We prove that every n-element structure M is identifiable by a formula in the Bernays-Schönfinkel class with less than $(1 - \frac{1}{2k^2+2})n + k$ quantifiers. If in this class of identifying formulas we restrict the number of universal quantifiers to k, then less than $n - \sqrt{n} + k^2 + k$ quantifiers suffice to identify M and, as long as we keep the number of universal quantifiers bounded by a constant, a total of $n - O(\sqrt{n})$ quantifiers is necessary.
1 Introduction

Let M be a structure over a vocabulary L. A closed first order formula $\Phi$ with relation symbols in $L \cup \{=\}$ is either true or false on M. If M' is another L-structure isomorphic with M, then $\Phi$ is equally true or false on M and M'. On the other hand, if M is finite and M' is non-isomorphic with M, then there is a formula $\Phi_{M,M'}$ that is true on M and false on M'. As is well known, for infinite structures this is not necessarily true. In this paper, however, we deal only with finite structures. We call the number of elements of a structure M its order.

If a first order formula $\Phi$ is true on M but false on M', we say that $\Phi$ distinguishes M from M'. We say that $\Phi$ defines an L-structure M if $\Phi$ distinguishes M from any other non-isomorphic L-structure M'. Furthermore, a formula $\Phi$ identifies a finite L-structure M if $\Phi$ distinguishes M from any other non-isomorphic L-structure M' of the same order.

We address the question of how simple a formula identifying (defining) a finite structure can be. The complexity measure of a first order formula we use here is the quantifier rank, that is, the maximum number of nested quantifiers in a formula. Let $I(M)$ (resp. $D(M)$) denote the minimum quantifier rank of a formula identifying (resp. defining) a structure M. We will pay special attention to formulas of restricted logical structure. The alternation number of a formula $\Phi$ is the maximum number of quantifier alternations over all possible sequences of nested quantifiers under the assumption that $\Phi$ is reduced to its negation normal form, i.e., all negations are assumed to occur only in front of atomic subformulas. By $I_l(M)$ and $D_l(M)$ we denote the variants of $I(M)$ and $D(M)$ for the class of formulas with alternation number at most $l$.

We will estimate $I(M)$ and $D(M)$ as functions of the order of M. The latter is denoted throughout the paper by n. A simple upper bound for $I(M)$ is

$$I(M) \le n.$$

Indeed, every structure M is identified by the formula

$$\exists x_1 \ldots \exists x_n \Bigl( \bigwedge_{1 \le i < j \le n} (x_i \ne x_j) \wedge \Psi_M(x_1, \ldots, x_n) \Bigr), \tag{1}$$

where $\Psi_M$ is the conjunction that gives an account of all relations between elements of M and negations thereof. For example, if M consists of a single binary relation $R^M$ on the set $\{1, \ldots, n\}$, then

$$\Psi_M(x_1, \ldots, x_n) = \bigwedge_{(i,j) \in R^M} R(x_i, x_j) \wedge \bigwedge_{(i,j) \notin R^M} \neg R(x_i, x_j).$$

It is an easy exercise to show that, if M has only unary relations, then $I_0(M) \le (n+1)/2$. In [14] we prove the following results. If M has only unary and binary relations, then $I_1(M) \le (n+3)/2$. In the particular case that M is an ordinary undirected graph, we are able to improve on the alternation number by showing that then $I_0(M) \le (n+5)/2$. It is not hard to show that these bounds are tight up to a small additive constant. If M is a k-uniform hypergraph, we have the bound $I_1(M) \le (1 - 1/k)n + 2k - 1$.
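
To make the identifying formula (1) concrete, the following is a minimal sketch that prints it for a structure with a single binary relation on {1, ..., n}. It assumes that (1) consists of n existential quantifiers, pairwise distinctness of the variables, and the conjunction $\Psi_M$; the function names and the ASCII syntax (`Ex1` for an existential quantifier, `~` for negation, `&` for conjunction) are illustrative choices, not notation from the paper.

```python
from itertools import combinations, product


def psi_M(n, relation):
    """Conjunction recording, for every pair (i, j), whether it is in R^M."""
    literals = []
    for i, j in product(range(1, n + 1), repeat=2):
        atom = f"R(x{i},x{j})"
        literals.append(atom if (i, j) in relation else f"~{atom}")
    return " & ".join(literals)


def identifying_formula(n, relation):
    """n existential quantifiers, pairwise distinctness, then Psi_M."""
    prefix = "".join(f"Ex{i} " for i in range(1, n + 1))
    distinct = [f"x{i}!=x{j}" for i, j in combinations(range(1, n + 1), 2)]
    body = " & ".join(distinct + [psi_M(n, relation)])
    return f"{prefix}({body})"


# A two-element structure whose only related pair is (1, 2).
formula = identifying_formula(2, {(1, 2)})
print(formula)
```

The quantifier rank of the printed formula equals the number of `Ex` prefixes, which is n, in line with the bound $I(M) \le n$.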

Here we continue the research initiated in [14] and prove a general upper bound

$$I_1(M) < \Bigl(1 - \frac{1}{2k}\Bigr) n + k^2 - k + 2, \tag{2}$$

where k, here and throughout, denotes the maximum relation arity of the vocabulary L.
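
As a quick illustration of how this bound scales with the arity, and assuming it has the form $(1 - \frac{1}{2k})n + k^2 - k + 2$ stated in the abstract, substituting small values of k gives

$$k = 2:\quad I_1(M) < \tfrac{3}{4}\,n + 4, \qquad\qquad k = 3:\quad I_1(M) < \tfrac{5}{6}\,n + 8.$$

For k = 2 this is weaker than the bound of order n/2 from [14] for unary and binary relations; the point of the general bound is that it covers every arity.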

A simple upper bound for $D(M)$ is

$$D_1(M) \le n + 1.$$

An appropriate defining formula is

$$\exists x_1 \ldots \exists x_n \forall x_{n+1} \Bigl( \bigwedge_{1 \le i < j \le n} (x_i \ne x_j) \wedge \bigvee_{i=1}^{n} (x_{n+1} = x_i) \wedge \Psi_M(x_1, \ldots, x_n) \Bigr),$$

where $\Psi_M$ is as in (1). The upper bound of n + 1 is generally best possible. For example, we have $D(M_n) = n + 1$ if $M_n$ consists of the single totally true unary relation or is a complete graph on n vertices. However, for a quite representative class of structures we are able to prove a better bound. We call a structure irredundant if its automorphism group contains no transposition of two elements. Similarly to (2), for any irredundant structure M we obtain

$$D_1(M) < \Bigl(1 - \frac{1}{2k}\Bigr) n + k^2 - k + 1. \tag{3}$$

This is a qualitative extension of a result in [14], where the bound $D_1(M) \le n/2 + 2$ is proved for any irredundant structure M with maximum relation arity 2. On the other hand, there are simple examples of irredundant structures with $D(M) \ge n/4$ (see Remark 4.4).

In fact, the bound $D_1(M) < (1 - \frac{1}{2k})n + k^2 - k + 2$ can fail only for structures with a simple, easily recognizable property. Namely, given elements u and v of M, let us call them similar if the transposition of u and v is an automorphism of M. It turns out that either we have this upper bound for $D_1(M)$ or otherwise M has more than $(1 - \frac{1}{2k})n + (k-1)^2$ pairwise similar elements. In the latter case we are able to easily compute the value of $D(M)$ up to an additive constant of k. For graphs such a dichotomy result was obtained in [14].

Furthermore, we address the identification of finite structures by formulas of the simplest logical structure, namely, those in the prenex normal form (or prenex formulas). In this case the quantifier rank is just the number of quantifiers occurring in a formula. Let $\Sigma_1$ (resp. $\Pi_1$) consist of the existential (resp. universal) prenex formulas. Furthermore, let $\Sigma_i$ (resp. $\Pi_i$) be the extension of $\Sigma_{i-1} \cup \Pi_{i-1}$ with prenex formulas whose quantifier prefix begins with $\exists$ (resp. with $\forall$) and has less than i quantifier alternations. In particular, $\Sigma_2$ is the well-known Bernays-Schönfinkel class of formulas (see [4] for the role of this class in finite model theory). Define $P_i(M)$ to be the minimum number of quantifiers in a $\Sigma_i \cup \Pi_i$ formula identifying a structure M. Similarly, let $BS(M)$ be the minimum number of quantifiers of an identifying formula in the Bernays-Schönfinkel class $\Sigma_2$. We hence have the following hierarchy:

$$\begin{aligned} I(M) &\le I_{i-1}(M) \le P_i(M) \le P_{i-1}(M), \quad i > 1; \\ P_2(M) &\le BS(M) \le P_1(M) \le n. \end{aligned} \tag{4}$$

The upper bound of n is here due to the identifying formula (1). The bound $P_1(M) \le n$ is generally best possible. It is attained, for example, if M consists of the single unary relation true on all but one element of the structure.

Our concern therefore becomes $BS(M)$, the next member from the top of the hierarchy (4). We prove that

$$BS(M) < \Bigl(1 - \frac{1}{2k^2 + 2}\Bigr) n + k. \tag{5}$$

Though the multiplicative constant in (5) is worse than that in the bound (2), the bound (5) may be regarded as a qualitative strengthening of (2) because the class of formulas in the former result is much more limited than that in the latter result. Curiously, the bound (5) strengthens the bound (2) also quantitatively if we consider a somewhat unusual complexity measure of a formula, namely, the total number of quantifiers occurring in it.

If we restrict the number of universal quantifiers to a constant, Bernays-Schönfinkel formulas become much less powerful. Let $BS_q(M)$ denote the minimum total number of quantifiers in a Bernays-Schönfinkel formula identifying M with at most q universal quantifiers. We prove that $BS_k(M) < n - \sqrt{n} + k^2 + k$ and that $BS_q(M) \ge n - O(\sqrt{n})$ as long as q is bounded by a constant.

To prove (2), we use the characterization of the quantifier rank of a formula distinguishing structures M and M' as the length of the Ehrenfeucht game on M and M' [3] (an essentially equivalent characterization in terms of partial isomorphisms between M and M' and extensions thereof is due to Fraïssé [5]). Unlike (2), our proof of (5) uses a direct approach. Nevertheless, both results share the same background, which is based on the notion of a base of a structure M.

Given a set X of elements of M and elements u and v of M, we say that X separates u and v if the extension of the identity map of X onto itself taking u to v is not a partial automorphism of M. Clearly, no X can separate similar u and v. On the other hand, if X separates every two non-similar elements in the complement of X, we call X a base of M. Every M trivially has (n - 1)-element bases. Our technical results imply that a considerably smaller base always exists.¹

¹ In fact, we do not state this explicitly. However, it is easy to derive from the estimate (39) that every structure has a base with less than $(1 - \frac{1}{2k^2+2})n$ elements. On the other hand, there are structures all of whose bases have at least n/2 elements. A simple example is given by the graph with $m$ pairwise non-adjacent edges.


Related work. Our paper is focused on the descriptive complexity of individual structures as opposed to the descriptive complexity of classes of structures. The latter is the subject of a large research area, which places much emphasis on monadic second order logic (we refer the reader to the survey [4] and textbooks [2, 8]).

The identification of graphs in first order logic is studied in [9, 10, 1, 6, 7] in aspects relevant to computer science. The main focus of this line of research is on the minimum number of variables in an identifying formula, where formulas are in the first order language enriched by counting quantifiers. This complexity measure of a formula corresponds to the dimension of the Weisfeiler-Lehman algorithm that succeeds in finding a canonical form of a graph [1].

The present paper studies, in a sense, the worst case descriptive complexity of a structure. Two other possibilities, the "best" and average structures, are considered in [13] and [11] in the case of graphs.

Organization of the paper. In Section 2 we explain the notation used throughout the paper, recall some basic definitions, define the Ehrenfeucht game and state its connection to distinguishing non-isomorphic structures in first order logic. In Section 3 we introduce some relations, partitions, transformations, and constructions over a finite structure and explore their properties. The main task performed in this section is the construction of a particular base in an arbitrary structure. We will benefit from these preliminaries while proving both our main results, bounds (2) and (5), in Sections 4 and 5 respectively. In Section 4 we also prove the bound (3) and the other definability results. Section 6 is devoted to identification by Bernays-Schönfinkel formulas with a bounded number of universal quantifiers. In Section 7 we focus on graphs and improve the bound (5) for this class of structures. We conclude with a list of open problems in Section 8.

2 Background

2.1 Notation

Writing $\bar u \in U^k$ for a set U and a positive integer k, we mean that $\bar u = (u_1, \ldots, u_k)$ with $u_i \in U$ for every $i \le k$. If $u, v \in U$, then $\bar u^{(uv)}$ denotes² the result of substituting v in place of every occurrence of u in $\bar u$ and substituting u in place of every occurrence of v in $\bar u$. Here (uv) denotes the transposition of u and v, that is, the permutation of U interchanging u and v and leaving the remaining elements unchanged. Given a function $\phi$ defined on U, we extend it over $U^k$ by $\phi(\bar u) = (\phi(u_1), \ldots, \phi(u_k))$ for $\bar u \in U^k$.

Notation $\mathrm{id}_U$ stands for the identity map of a set U onto itself. The domain and range of a function f are denoted by dom f and range f respectively. By $f^{(k)}$ we denote the k-fold composition of f.

² The double use of the character u here should not be confusing: we will often use u to denote a single element of a sequence $\bar u$.
2.2 Basic definitions

A k-ary relation R on a set V (or a relation R of arity k) is a function from $V^k$ to $\{0, 1\}$. A vocabulary is a finite sequence $R_1, \ldots, R_m$ of relation symbols along with a sequence $k_1, \ldots, k_m$ of positive integers, where each $k_i$ is the arity of the respective $R_i$. If L is a vocabulary, a finite structure A over L (or an L-structure A) is a finite set V(A), called the universe, along with relations $R_1^A, \ldots, R_m^A$, where $R_i^A$ has arity $k_i$. The order of A is the number of elements in the universe V(A). If $U \subseteq V(A)$, then A induces on U the structure A[U] with the universe V(A[U]) = U and relations $R_1^{A[U]}, \ldots, R_m^{A[U]}$ such that $R_i^{A[U]} \bar a = R_i^A \bar a$ for every $\bar a \in U^{k_i}$. Two L-structures A and B are isomorphic if there is a one-to-one map $\phi : V(A) \to V(B)$, called an isomorphism from A to B, such that $R_i^A \bar a = R_i^B \phi\bar a$ for every $i \le m$ and all $\bar a \in V(A)^{k_i}$. An automorphism of A is an isomorphism from A to itself. If $U \subseteq V(A)$ and $W \subseteq V(B)$, we call a one-to-one map $\phi : U \to W$ a partial isomorphism from A to B if it is an isomorphism from A[U] to B[W].

Without loss of generality we assume first order formulas to be over the set of connectives $\{\neg, \wedge, \vee\}$.

Definition 2.1 A sequence of quantifiers is a finite word over the alphabet $\{\exists, \forall\}$. If S is a set of such sequences, then $\exists S$ (resp. $\forall S$) means the set of concatenations $\exists s$ (resp. $\forall s$) for all $s \in S$. If s is a sequence of quantifiers, then $\bar s$ denotes the result of replacing all occurrences of $\exists$ with $\forall$ and vice versa in s. The set $\bar S$ consists of all $\bar s$ for $s \in S$.

Given a first order formula $\Phi$, its set of sequences of nested quantifiers is denoted by $\mathrm{Nest}(\Phi)$ and defined by induction as follows:

1) $\mathrm{Nest}(\Phi) = \{\Lambda\}$ if $\Phi$ is atomic, where $\Lambda$ denotes the empty word.

2) $\mathrm{Nest}(\neg\Phi) = \overline{\mathrm{Nest}(\Phi)}$.

3) $\mathrm{Nest}(\Phi \wedge \Psi) = \mathrm{Nest}(\Phi \vee \Psi) = \mathrm{Nest}(\Phi) \cup \mathrm{Nest}(\Psi)$.

4) $\mathrm{Nest}(\exists x \Phi) = \exists\,\mathrm{Nest}(\Phi)$ and $\mathrm{Nest}(\forall x \Phi) = \forall\,\mathrm{Nest}(\Phi)$.

The quantifier rank of a formula $\Phi$, denoted by $\mathrm{qr}(\Phi)$, is the maximum length of a string in $\mathrm{Nest}(\Phi)$.

Given a sequence of quantifiers s, let alt(s) denote the number of occurrences of $\exists\forall$ and $\forall\exists$ in s. The alternation number of a first order formula $\Phi$ is the maximum alt(s) over $s \in \mathrm{Nest}(\Phi)$.
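
The inductive definition above translates directly into a short recursive procedure. The sketch below is only an illustration: the encoding of formulas as nested tuples and of quantifier sequences as strings over the letters `E` (existential) and `A` (universal) is an assumption made here for convenience.

```python
def flip(s):
    """The bar operation: swap E and A throughout a quantifier sequence."""
    return s.translate(str.maketrans("EA", "AE"))


def nest(phi):
    """Set of sequences of nested quantifiers, following items 1) to 4)."""
    op = phi[0]
    if op == "atom":
        return {""}  # the empty word
    if op == "not":
        return {flip(s) for s in nest(phi[1])}
    if op in ("and", "or"):
        return nest(phi[1]) | nest(phi[2])
    if op == "exists":
        return {"E" + s for s in nest(phi[2])}
    if op == "forall":
        return {"A" + s for s in nest(phi[2])}
    raise ValueError(f"unknown connective: {op}")


def qr(phi):
    """Quantifier rank: the maximum length of a string in Nest(phi)."""
    return max(len(s) for s in nest(phi))


def alt(s):
    """Number of occurrences of EA and AE in s."""
    return sum(1 for x, y in zip(s, s[1:]) if x != y)


def alternation_number(phi):
    return max(alt(s) for s in nest(phi))


# not exists x (P(x) and forall y R(x, y))
phi = ("not", ("exists", "x",
               ("and", ("atom", "P(x)"), ("forall", "y", ("atom", "R(x,y)")))))
print(sorted(nest(phi)), qr(phi), alternation_number(phi))
```

Because negation flips every quantifier in the sequences below it, the alternation number computed this way agrees with the one obtained after pushing negations down to the atoms.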

Given an L-structure A and a closed first order formula $\Phi$ whose relation symbols are from $L \cup \{=\}$, we write $A \models \Phi$ if $\Phi$ is true on A and $A \not\models \Phi$ otherwise. Given A, a formula $\Psi(x_1, \ldots, x_m)$ with m free variables $x_1, \ldots, x_m$, and a sequence $a_1, \ldots, a_m$ of elements in V(A), we write $A, a_1, \ldots, a_m \models \Psi(x_1, \ldots, x_m)$ if $\Psi(x_1, \ldots, x_m)$ is true on A with each $x_i$ assigned the respective $a_i$.

If B is another L-structure, we say that a formula $\Phi$ distinguishes A from B if $A \models \Phi$ but $B \not\models \Phi$. We say that $\Phi$ defines an L-structure A (up to an isomorphism) if $\Phi$ distinguishes A from any non-isomorphic L-structure B. We say that $\Phi$ identifies an L-structure A of order n (up to an isomorphism in the class of L-structures of the same order) if $\Phi$ distinguishes A from any non-isomorphic L-structure B of order n.

By $D(A, B)$ (resp. $D_l(A, B)$) we denote the minimum quantifier rank of a formula (resp. with alternation number at most $l$) distinguishing a structure A from a structure B. By $D(A)$ (resp. $D_l(A)$) we denote the minimum quantifier rank of a formula defining A (resp. with alternation number at most $l$). By $I(A)$ (resp. $I_l(A)$) we denote the minimum quantifier rank of a formula identifying A (resp. with alternation number at most $l$).

Lemma 2.2 Let A be a finite structure over vocabulary L. Then the following equalities hold true:

$$\begin{aligned}
D(A) &= \max\{\, D(A, B) : B \not\cong A \,\}, \\
D_l(A) &= \max\{\, D_l(A, B) : B \not\cong A \,\}, \\
I(A) &= \max\{\, D(A, B) : B \not\cong A,\ |V(B)| = |V(A)| \,\}, \\
I_l(A) &= \max\{\, D_l(A, B) : B \not\cong A,\ |V(B)| = |V(A)| \,\},
\end{aligned}$$

where $\cong$ denotes the isomorphism relation between L-structures.

Proof. We prove the first equality; the proof of the others is similar. Given an L-structure B non-isomorphic with A, let $\Phi_B$ be a formula of minimum quantifier rank distinguishing A from B, that is, $\mathrm{qr}(\Phi_B) = D(A, B)$. Let $R = \max_B \mathrm{qr}(\Phi_B)$. We have $D(A) \ge R$ because $D(A) \ge D(A, B)$ for every B. To prove the reverse inequality $D(A) \le R$, notice that A is defined by the formula $\Phi = \bigwedge_B \Phi_B$ whose quantifier rank is R. The only problem is that $\Phi$ is an infinite conjunction (a $FO_{\infty\omega}$-formula). However, as is well known, over a fixed finite vocabulary there are only finitely many inequivalent first order formulas of bounded quantifier rank (see e.g. [1, 2, 8]). We therefore can reduce $\Phi$ to a finite conjunction. ■

2.3 The Ehrenfeucht game

Let A and B be structures over the same vocabulary with disjoint universes. The r-round Ehrenfeucht game on A and B, denoted by $\mathrm{Ehr}_r(A, B)$, is played by two players, Spoiler and Duplicator, with r pairwise distinct pebbles $p_1, \ldots, p_r$, each given in duplicate. Spoiler starts the game. A round consists of a move of Spoiler followed by a move of Duplicator. In the s-th round Spoiler selects one of the structures A or B and places $p_s$ on an element of this structure. In response Duplicator should place the other copy of $p_s$ on an element of the other structure. It is allowed to place more than one pebble on the same element. We will use $a_s$ (resp. $b_s$) to denote the element of A (resp. B) occupied by $p_s$, irrespective of which of the players places the pebble on this element. If after each of the r rounds it is true that

$$a_i = a_j \ \text{ iff } \ b_i = b_j \quad \text{for all } i, j \le s,$$

and the component-wise correspondence between $(a_1, \ldots, a_s)$ and $(b_1, \ldots, b_s)$ is a partial isomorphism from A to B, this is a win for Duplicator; otherwise the winner is Spoiler.

The $l$-alternation Ehrenfeucht game on A and B is a variant of the game in which Spoiler is allowed to switch from one structure to another at most $l$ times during the game, i.e., in at most $l$ rounds he can choose the structure other than that in the preceding round.

The following statement provides us with a robust technical tool.

Lemma 2.3 Let A and B be non-isomorphic structures over the same vocabulary.

1) $D(A, B)$ equals the minimum r such that Spoiler has a winning strategy in $\mathrm{Ehr}_r(A, B)$.

2) $D_l(A, B)$ equals the minimum r such that Spoiler has a winning strategy in the $l$-alternation $\mathrm{Ehr}_r(A, B)$. ■

We refer the reader to [2, Theorem 1.2.8], [8, Theorem 6.10], or [15, Theorem 2.3.1] for the proof of the first claim and to [12] for the second claim.
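
For very small structures the first claim of Lemma 2.3 can be checked by exhaustive search over all plays of the game. The following brute-force sketch is an illustration only: the encoding of a structure as a dictionary with a `universe` list and a list of `(arity, set of tuples)` relations, and all function names, are assumptions made here. The search is exponential in the number of rounds and does not model the restriction on alternations.

```python
from itertools import product


def is_partial_iso(A, B, a, b):
    """Check the winning condition for Duplicator on pebbled tuples a, b."""
    n = len(a)
    for i, j in product(range(n), repeat=2):
        if (a[i] == a[j]) != (b[i] == b[j]):
            return False
    for (arity, RA), (_, RB) in zip(A["relations"], B["relations"]):
        for idx in product(range(n), repeat=arity):
            ta = tuple(a[i] for i in idx)
            tb = tuple(b[i] for i in idx)
            if (ta in RA) != (tb in RB):
                return False
    return True


def spoiler_wins(A, B, r, a=(), b=()):
    """True if Spoiler can win within r further rounds from position (a, b)."""
    if not is_partial_iso(A, B, a, b):
        return True
    if r == 0:
        return False
    # Spoiler pebbles x in A; every answer y in B must lose for Duplicator.
    for x in A["universe"]:
        if all(spoiler_wins(A, B, r - 1, a + (x,), b + (y,)) for y in B["universe"]):
            return True
    # Spoiler pebbles y in B; every answer x in A must lose for Duplicator.
    for y in B["universe"]:
        if all(spoiler_wins(A, B, r - 1, a + (x,), b + (y,)) for x in A["universe"]):
            return True
    return False


def D(A, B):
    """Least r such that Spoiler wins the r-round game (A, B non-isomorphic)."""
    r = 0
    while not spoiler_wins(A, B, r):
        r += 1
    return r


def graph(n, edges):
    """Undirected graph as a structure with one symmetric binary relation."""
    rel = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    return {"universe": list(range(n)), "relations": [(2, rel)]}


path3 = graph(3, [(0, 1), (1, 2)])
triangle = graph(3, [(0, 1), (1, 2), (0, 2)])
print(D(path3, triangle))
```

On this pair Spoiler needs two rounds: he pebbles the two non-adjacent end vertices of the path, and the triangle has no pair of distinct non-adjacent vertices to answer with. Note that `D` loops forever on isomorphic inputs, in line with the hypothesis of the lemma.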

3 Exploring structural properties of finite structures

3.1 A few useful relations

Throughout this section we are given an arbitrary finite structure M over vocabulary L. We abbreviate V = V(M).

Definition 3.1 For $a, b \in V$ we write $a \sim b$ if the transposition (ab) is an automorphism of M. In other words, $a \sim b$ if, for every $l$-ary relation R of M, we have $R\bar a = R\bar a^{(ab)}$ for all $\bar a \in V^l$.

Lemma 3.2 $\sim$ is an equivalence relation on V.

Proof. The relation is obviously reflexive and symmetric. The transitivity follows from the facts that the composition of automorphisms is an automorphism and that the transposition (ac) is decomposed into a composition of (ab) and (bc), namely (ac) = (ab)(bc)(ab). ■
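
Definition 3.1 can be tested mechanically on a small structure. The sketch below uses an assumed encoding (a dictionary with a `universe` list and a list of `(arity, set of tuples)` relations) and illustrative function names. It relies on Lemma 3.2: since $\sim$ is an equivalence relation, comparing each element with one representative per class is enough.

```python
from itertools import product


def swap(t, a, b):
    """Apply the transposition (ab) to every entry of the tuple t."""
    return tuple(b if x == a else a if x == b else x for x in t)


def similar(M, a, b):
    """a ~ b: the transposition (ab) is an automorphism of M."""
    return all(
        (t in R) == (swap(t, a, b) in R)
        for arity, R in M["relations"]
        for t in product(M["universe"], repeat=arity)
    )


def similarity_classes(M):
    """Partition of the universe into ~-classes."""
    classes = []
    for v in M["universe"]:
        for cls in classes:
            if similar(M, cls[0], v):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes


# The star with center 0 and leaves 1, 2, 3, as a symmetric binary relation.
edges = {(0, i) for i in (1, 2, 3)} | {(i, 0) for i in (1, 2, 3)}
star = {"universe": [0, 1, 2, 3], "relations": [(2, edges)]}
print(similarity_classes(star))
```

In the star the three leaves are pairwise similar while the center is similar to no other vertex, so the structure is not irredundant in the sense of the Introduction.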

Given $X \subseteq V$, we will denote its complement by $\bar X = V \setminus X$.

Definition 3.3 Let $X \subseteq V$ and $a, b \in \bar X$. We write $a \equiv_X b$ if $\mathrm{id}_X$ extends to an isomorphism from $M[X \cup \{a\}]$ to $M[X \cup \{b\}]$. In other words, for every $l$-ary relation R of M, we have $R\bar a = R\bar a^{(ab)}$ for all $\bar a \in (X \cup \{a\})^l$.

Furthermore, we write $a \sim_X b$ if the transposition (ab) is an automorphism of $M[X \cup \{a, b\}]$. In other words, for every $l$-ary relation R of M, we have $R\bar a = R\bar a^{(ab)}$ for all $\bar a \in (X \cup \{a, b\})^l$.

Clearly, $a \sim_X b$ implies $a \equiv_X b$. It is also clear that $\equiv_X$ is an equivalence relation on $\bar X$. In contrast to this, simple examples show that $\sim_X$ is generally not an equivalence relation.

Definition 3.4 $C(X)$ is the partition of $\bar X$ into $\equiv_X$-equivalence classes. Furthermore, $C^m(X) = \{C \in C(X) : |C| \le m\}$.
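
The partition $C(X)$ is easy to compute for small structures. The sketch below is an illustration under an assumed encoding (a dictionary with a `universe` list and a list of `(arity, set of tuples)` relations); the function names are hypothetical. It reads $a \equiv_X b$ as in Definition 3.3: replacing a by b in any tuple over $X \cup \{a\}$ does not change membership in any relation.

```python
from itertools import product


def equiv_X(M, X, a, b):
    """a =_X b for a, b outside X: id_X extends by a -> b to an isomorphism."""
    ground = list(X) + [a]
    return all(
        (t in R) == (tuple(b if x == a else x for x in t) in R)
        for arity, R in M["relations"]
        for t in product(ground, repeat=arity)
    )


def partition_C(M, X):
    """C(X): classes of the complement of X under the relation =_X."""
    classes = []
    for v in M["universe"]:
        if v in X:
            continue
        for cls in classes:
            if equiv_X(M, X, cls[0], v):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes


def partition_C_small(M, X, m):
    """C^m(X): the classes of C(X) with at most m elements."""
    return [C for C in partition_C(M, X) if len(C) <= m]


# The path 0 - 1 - 2 - 3 as a symmetric binary relation.
pairs = [(0, 1), (1, 2), (2, 3)]
path4 = {"universe": [0, 1, 2, 3],
         "relations": [(2, set(pairs) | {(v, u) for u, v in pairs})]}
print(partition_C(path4, set()), partition_C(path4, {0}))
```

With X empty all four vertices fall into one class, because a loopless graph looks the same at every single vertex. Adding vertex 0 to X separates its neighbour 1 from the vertices 2 and 3.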

The following lemma points out some trivial but important properties of the partition $C(X)$.

Lemma 3.5

1) If $X_1 \subseteq X_2$, then $C(X_2)$ is a refinement of $C(X_1)$ on $\bar X_2$.

2) For any X, the $\sim$-equivalence classes restricted to $\bar X$ refine the partition $C(X)$.

In the sequel M' denotes another L-structure.

Definition 3.6 Let $\phi : X \to X'$ be a partial isomorphism from M to M'. Let $a \in \bar X$ and $a' \in \overline{X'}$. We write $a \equiv_\phi a'$ if $\phi$ extends to an isomorphism from $M[X \cup \{a\}]$ to $M'[X' \cup \{a'\}]$.

Lemma 3.7 Let $\phi : X \to X'$ be a partial isomorphism from M to M'. Then the following claims are true.

1) Assume that $a \equiv_X b$ and $a' \equiv_{X'} b'$. Then $a \equiv_\phi a'$ iff $b \equiv_\phi b'$.

2) Assume that $a \equiv_\phi a'$ and $b \equiv_\phi b'$. Then $a \equiv_X b$ iff $a' \equiv_{X'} b'$.

3) Let $\psi$ be a partial isomorphism from M to M' which is an extension of $\phi$. If $a \in \mathrm{dom}\,\psi \setminus X$, then $a \equiv_\phi \psi(a)$.

4) Let $\psi$ be a partial isomorphism from M to M' which is an extension of $\phi$. Let $a, b \in \mathrm{dom}\,\psi \setminus X$. Then $a \equiv_X b$ iff $\psi(a) \equiv_{X'} \psi(b)$. ■

The proof is easy. Item 1 of the lemma makes the following definition correct.

Definition 3.8 Let $\phi : X \to X'$ be a partial isomorphism from M to M'. Let $C \in C(X)$ and $C' \in C(X')$. We write $C \equiv_\phi C'$ if $a \equiv_\phi a'$ for some (equivalently, for all) $a \in C$ and $a' \in C'$.
3.2 A couple of useful transformations

Let M be a finite structure of order n with the maximum relation arity k. Let $X \subseteq V(M)$. We define two transformations that, if applicable to X, extend it to a larger set.

Transformation T. If there exists a set $S \subseteq \bar X$ with at most k - 1 elements such that $|C(X \cup S)| > |C(X)|$, take the lexicographically first such S and set $T(X) = X \cup S$. Otherwise T is not applicable to X.

Transformation E. Apply T iteratively as long as it is applicable. The result is denoted by $E(X)$. In other words, $E(X) = T^{(n)}(X)$. If T is not applicable at all, set $E(X) = X$.

Lemma 3.9 Assume that T is not applicable to X. If $C \in C(X) \setminus C^2(X)$, then $a \sim_X b$ for every $a, b \in C$.

Proof. Let $C \in C(X)$ and $|C| \ge 3$. Given a and b in C, we have to show that $a \sim_X b$. In other words, our task is, given an $l$-ary relation R of M and $\bar a \in (X \cup \{a, b\})^l$, to show that $R\bar a = R\bar a^{(ab)}$. If $\bar a$ contains no occurrence of a or no occurrence of b, this equality is true because $a \equiv_X b$. It remains to consider the case that $\bar a$ contains occurrences of both a and b.

Claim A. Let u, v, and w be pairwise distinct elements in C. Let R be an $l$-ary relation of M and $\bar u \in (X \cup \{u, v\})^l$ with occurrences of both u and v. Then $R\bar u = R\bar u^{(vw)}$.

Proof of Claim. If $R\bar u \ne R\bar u^{(vw)}$, then removal of u from C to X splits C into at least two $\equiv_{X \cup \{u\}}$-subclasses, containing v and w respectively. This contradicts the assumption that T is not applicable to X. □

Let c be an arbitrary element in $C \setminus \{a, b\}$. Applying Claim A repeatedly three times, we obtain

$$R\bar a = R\bar a^{(bc)} = R(\bar a^{(bc)})^{(ab)} = R((\bar a^{(bc)})^{(ab)})^{(ac)} = R\bar a^{(bc)(ab)(ac)} = R\bar a^{(ab)},$$

as required. ■

Lemma 3.10 $|E(X) \setminus X| \le (k - 1)\bigl(|C(E(X))| - |C(X)|\bigr)$. ■

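3.3 The many-layered base of a finite structure

Definition 3.11 Suppose that a finite structure M with maximum relation arity k is given. For $X \subseteq V(M)$, let $Y(X) = \bigcup_{C \in C^{k+1}(X)} C$. We set

$$\begin{aligned}
X_0 &= Y_0 = \emptyset, \\
X_i &= E(X_{i-1} \cup Y_{i-1}) \quad \text{for } 1 \le i \le k, \\
Y_i &= Y(X_i) \quad \text{for } 1 \le i \le k, \\
X_{k+1} &= X_k \cup Y_k, \\
Z &= V(M) \setminus X_{k+1}.
\end{aligned}$$

We will call $X_{k+1}$ the base of M.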
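
The construction of Definition 3.11, together with Transformations T and E, can be sketched as follows for small structures. This is an illustration, not an efficient implementation: the encoding of structures (a dictionary with a `universe` list and a list of `(arity, set of tuples)` relations) and the function names are assumptions, and "lexicographically first" is interpreted here as the first candidate set in the order produced by `itertools.combinations`, with smaller sets tried first.

```python
from itertools import combinations, product


def equiv_X(M, X, a, b):
    """a =_X b: replacing a by b in tuples over X + {a} preserves relations."""
    ground = list(X) + [a]
    return all(
        (t in R) == (tuple(b if x == a else x for x in t) in R)
        for arity, R in M["relations"]
        for t in product(ground, repeat=arity)
    )


def partition_C(M, X):
    """C(X): classes of the complement of X under =_X."""
    classes = []
    for v in M["universe"]:
        if v in X:
            continue
        for cls in classes:
            if equiv_X(M, X, cls[0], v):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes


def T(M, X, k):
    """One step of Transformation T, or None if T is not applicable to X."""
    rest = [v for v in M["universe"] if v not in X]
    size = len(partition_C(M, X))
    for s in range(1, k):
        for S in combinations(rest, s):
            if len(partition_C(M, X | set(S))) > size:
                return X | set(S)
    return None


def E(M, X, k):
    """Transformation E: apply T as long as it is applicable."""
    nxt = T(M, X, k)
    while nxt is not None:
        X = nxt
        nxt = T(M, X, k)
    return X


def Y(M, X, k):
    """Union of the classes of C(X) with at most k + 1 elements."""
    return {v for C in partition_C(M, X) if len(C) <= k + 1 for v in C}


def base(M, k):
    """The set X_{k+1} of Definition 3.11."""
    X, Yi = set(), set()
    for _ in range(k):
        X = E(M, X | Yi, k)
        Yi = Y(M, X, k)
    return X | Yi


# The star with center 0 and leaves 1, ..., 5 (maximum arity k = 2).
edges = {(0, i) for i in range(1, 6)} | {(i, 0) for i in range(1, 6)}
star5 = {"universe": list(range(6)), "relations": [(2, edges)]}
print(base(star5, 2))
```

For this star the procedure first picks the leaf 1, which separates the center from the remaining leaves, then absorbs the center as a small class, and stops with the base {0, 1}. The complement Z consists of the four remaining leaves, which are pairwise similar.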
An important role of the base of a finite structure is due to the following fact (cf. the more general Definition 5.3).

Lemma 3.12 On Z the relations $\equiv_{X_k}$, $\equiv_{X_{k+1}}$, and $\sim$ coincide.

Proof. We start with relations $\equiv_{X_k}$ and $\sim$. Assume on the contrary that $a \equiv_{X_k} b$ but $a \not\sim b$ for some $a, b \in Z$. The latter means that, for some $l$-ary relation R of M and $\bar a \in V^l$ with at least one occurrence of a,

$$R\bar a^{(ab)} \ne R\bar a. \tag{6}$$

Denote $A = \{a_1, \ldots, a_l\} \setminus \{a, b\}$. Since $|A| \le k - 1$ and the $Y_i$'s are pairwise disjoint, there is $j \le k$ such that

$$A \cap Y_j = \emptyset. \tag{7}$$

Remove all elements from $A \setminus X_j$ to $X_j$ and set $X'_j = X_j \cup A$. Due to (6), this operation has the effect that

$$a \not\sim_{X'_j} b. \tag{8}$$

No class in $C(X_j)$ can disappear completely: the classes in $C^{k+1}(X_j)$ can only split up because of (7), the classes in $C(X_j) \setminus C^{k+1}(X_j)$ can lose up to k - 1 elements and/or split up.

Since $a \equiv_{X_k} b$ and $a, b \in Z$, both a and b belong to the same $\equiv_{X_k}$-class $C^*$ containing at least k + 2 elements. Let C be the $\equiv_{X_j}$-class such that $C^* \subseteq C$. We now show that C is split up after modifying $X_j$ and therefore $|C(X'_j)| > |C(X_j)|$, making a contradiction to the construction of $X_j$.

Indeed, if $a \not\equiv_{X'_j} b$, we have two subclasses containing respectively a and b. If $a \equiv_{X'_j} b$, it follows by Lemma 3.9 from (8) that the class in $C(X'_j)$ containing a and b is exactly $\{a, b\}$. After removing at most k - 1 elements, in C there remain at least 3 elements and therefore C must have at least one more $\equiv_{X'_j}$-subclass besides $\{a, b\}$.

Thus, on Z the relations $\equiv_{X_k}$ and $\sim$ are identical. By Item 1 of Lemma 3.5, on Z the relation $\equiv_{X_{k+1}}$ refines $\equiv_{X_k}$. By Item 2 of the same lemma the converse is also true. It follows that on Z the relations $\equiv_{X_{k+1}}$ and $\equiv_{X_k}$ also coincide. ■

Lemma 3.13 Let n be the order of M and k be the maximum relation arity of M. We have

$$2k \sum_{i=1}^{k} |C^{k+1}(X_i)| + \Bigl(2 - \frac{3}{k+2}\Bigr)|Z| \ge n + k - 1 \tag{9}$$

and

$$2k \sum_{i=1}^{k-1} |C^{k+1}(X_i)| + (k+1)|C^{k+1}(X_k)| + (k-1)|C(X_k)| + |Z| \ge n + k - 1. \tag{10}$$
Proof. By Lemma 3.10 we have

$$|X_1| \le (k-1)\bigl(|C(X_1)| - 1\bigr), \tag{11}$$

$$|X_i \setminus (X_{i-1} \cup Y_{i-1})| \le (k-1)\bigl(|C(X_i)| - |C(X_{i-1} \cup Y_{i-1})|\bigr) \tag{12}$$

for $2 \le i \le k$. Note that

$$|C(X_i)| = |C^{k+1}(X_i)| + |C(X_i) \setminus C^{k+1}(X_i)|$$

and

$$|C(X_i) \setminus C^{k+1}(X_i)| \le |C(X_i \cup Y_i)|$$

for $1 \le i \le k$. The latter inequality is true because, according to Item 1 of Lemma 3.5, the partition $C(X_i \cup Y_i)$ is a refinement of $C(X_i) \setminus C^{k+1}(X_i)$. Combining it with (11) and (12), we obtain

$$|X_1| \le (k-1)\bigl(|C^{k+1}(X_1)| + |C(X_1 \cup Y_1)| - 1\bigr), \tag{13}$$

$$|X_i \setminus (X_{i-1} \cup Y_{i-1})| \le (k-1)\bigl(|C^{k+1}(X_i)| + |C(X_i \cup Y_i)| - |C(X_{i-1} \cup Y_{i-1})|\bigr). \tag{14}$$

Summing up (13) and (14) over all $2 \le i \le k$, we have

$$|X_1| + \sum_{i=2}^{k} |X_i \setminus (X_{i-1} \cup Y_{i-1})| \le (k-1)\Bigl(\sum_{i=1}^{k} |C^{k+1}(X_i)| + |C(X_k \cup Y_k)| - 1\Bigr). \tag{15}$$

According to Lemma 3.12,

$$C(X_k \cup Y_k) = C(X_k) \setminus C^{k+1}(X_k) \tag{16}$$

and, as a consequence,

$$|C(X_k \cup Y_k)| \le |Z| / (k+2). \tag{17}$$

From (15) we conclude, using (16), that

$$|X_1| + \sum_{i=2}^{k} |X_i \setminus (X_{i-1} \cup Y_{i-1})| \le (k-1)\Bigl(\sum_{i=1}^{k-1} |C^{k+1}(X_i)| + |C(X_k)| - 1\Bigr) \tag{18}$$

and, using (17), that

$$|X_1| + \sum_{i=2}^{k} |X_i \setminus (X_{i-1} \cup Y_{i-1})| \le (k-1)\Bigl(\sum_{i=1}^{k} |C^{k+1}(X_i)| + \frac{|Z|}{k+2} - 1\Bigr). \tag{19}$$

Notice also a trivial inequality

$$|Y_i| \le (k+1)|C^{k+1}(X_i)|. \tag{20}$$

It is easy to see that

$$n = |X_1| + \sum_{i=2}^{k} |X_i \setminus (X_{i-1} \cup Y_{i-1})| + \sum_{i=1}^{k} |Y_i| + |Z|. \tag{21}$$

Using (18) and (20), we derive from (21) that

$$n \le 2k \sum_{i=1}^{k-1} |C^{k+1}(X_i)| + (k+1)|C^{k+1}(X_k)| + (k-1)|C(X_k)| + |Z| - (k-1),$$

which implies (10). Using (19) and (20), we derive from (21) that

$$n \le 2k \sum_{i=1}^{k} |C^{k+1}(X_i)| + \Bigl(2 - \frac{3}{k+2}\Bigr)|Z| - (k-1), \tag{22}$$

which implies (9). ■

4 Identifying finite structures with smaller quantifier rank

Theorem 4.1 Let L be a vocabulary with maximum relation arity k. For every L-structure M of order n we have

$$I_1(M) < \Bigl(1 - \frac{1}{2k}\Bigr) n + k^2 - k + 2.$$

The proof takes the next two subsections. The case of k = 1 is an easy exercise and we will assume that $k \ge 2$. According to Lemma 2.2, it suffices to consider an arbitrary L-structure M' non-isomorphic with M and of the same order n, and estimate the value of $D_1(M, M')$. We will design a strategy enabling Spoiler to win the Ehrenfeucht game on M and M' in less than $(1 - \frac{1}{2k})n + k^2 - k + 2$ moves with at most one alternation between the structures. This will give us the desired bound by Lemma 2.3.



4.1 Spoiler's strategy

The strategy splits play into k + 2 phases. Spoiler will play almost all the time in M, possibly with one alternation from M to M' at the end of the game. For each vertex $v \in V(M)$ selected by Spoiler up to Phase i, let $\phi^*_i(v)$ denote the vertex in V(M') selected in response by Duplicator. Thus, each subsequent $\phi^*_{i+1}$ extends $\phi^*_i$. Provided Phase i has already been finished but the game not yet, $\phi^*_i$ is a partial isomorphism from M to M'. Under the same condition, it will always be the case that $\mathrm{dom}\,\phi^*_i \subseteq X_i$. We will use notation $\tilde Y_{i-1} = \mathrm{dom}\,\phi^*_i \cap Y_{i-1}$. Recall that the sets $X_i$ and $Y_i$ are defined by Definition 3.11 so that $Y_{i-1} \subseteq X_i$.

Phase 1.

Spoiler selects all vertices in $X_1$. Let $X'_1 = \phi^*_1(X_1)$.
End of phase description.

Phase j + 1, $1 \le j \le k$.

Our description of Phase j + 1 is based on the assumption that Phase j is complete but the game is not finished yet and that the following conditions are true for every $1 \le i \le j$.

Condition 1. $\phi^*_i$ has a unique extension $\phi_i$ over the whole $X_i$ that is a partial isomorphism from M to M'. Let $X'_i = \phi_i(X_i)$.

Condition 2. There is a one-to-one correspondence between the partitions $C^{k+1}(X_{i-1})$ and $C^{k+1}(X'_{i-1})$ such that, if $C' \in C^{k+1}(X'_{i-1})$ corresponds to $C \in C^{k+1}(X_{i-1})$, then $C \equiv_{\phi_{i-1}} C'$ and $|C| = |C'|$.

Condition 3. For every $C \in C^{k+1}(X_{i-1})$, $\phi^*_i$ is defined on all but one element of C. Denote $\tilde C = \mathrm{dom}\,\phi^*_i \cap C$. Then $\phi^*_i(\tilde C) \subset C'$, where $C'$ corresponds to C according to Condition 2. Furthermore, $\phi_i$ takes the single element in $C \setminus \tilde C$ to the single element in $C' \setminus \phi^*_i(\tilde C)$. Thus, $\phi_i(C) = C'$.

For further reference we denote the set $\phi_i(Y_{i-1}) = Y(X'_{i-1})$ by $Y'_{i-1}$ and its subset $\phi^*_i(\tilde Y_{i-1})$ by $\tilde Y'_{i-1}$.

Condition 1 is true for i = 1 because $\phi^*_1$ is defined on the whole $X_1$. For the sake of technical convenience, we set $X'_0 = X_0 = \emptyset$. We suppose that n > k + 1 (otherwise Theorem 4.1 is trivially true). This implies that $C^{k+1}(X_0) = C^{k+1}(X'_0) = \emptyset$ and makes Conditions 2 and 3 for i = 1 trivially true. For i > 1 Conditions 1-3 follow by induction from Claim C below.

In the sequel we will intensively exploit the following notion. We say that a pair $(a, a') \in V(M) \times V(M')$ is i-threatening (for Duplicator) if a and a' are selected by the players in the same round after Phase i and

• $a \notin X_i$ or $a' \notin X'_i$,

• $a \not\equiv_{\phi_i} a'$.

We now start the description of the phase. It consists of two parts.

Part 1. As long as no i-threatening pair arises for $1 \le i \le j$, Spoiler selects all but one element in each class $C \in C^{k+1}(X_j)$. The set of the vertices selected in C will be denoted by $\tilde C$. Furthermore, Spoiler selects all vertices in $X_{j+1} \setminus (X_j \cup Y_j)$. As soon as an i-threatening pair for some $1 \le i \le j$ arises, Spoiler switches to the strategy given by Claim B below and wins in at most (i - 1)(k - 1) moves.

Part 2. Assume that Part 1 finishes and Duplicator still does not lose. Then, if Spoiler is able to win in at most k next moves irrespective of Duplicator's strategy, he does so and the game finishes. If he is not able to win but is able in at most k moves to force the creation of an i-threatening pair for some $i \le j$, he does so and wins in at most (i - 1)(k - 1) subsequent moves using the strategy of Claim B. Otherwise Phase j + 1 is complete and the next Phase j + 2 starts.

End of phase description.

Claim A. Let $i \le k + 1$. Suppose that Phase i is finished and Conditions 1-3 are met for i and all its preceding values. Assume that $a \in V(M)$ and $a' \in V(M')$ are selected by the players in the same round after Phase i and neither of them has been selected before. If

• $a \in X_i$ but $a' \ne \phi_i(a)$

or

• $a' \in X'_i$ but $a \ne \phi_i^{-1}(a')$,

then the pair (a, a') is m-threatening for some m < i.

Proof of Claim. Let m, $1 \le m < i$, be the largest index such that neither $a \in X_m$ nor $a' \in X'_m$. Then $a \in X_{m+1}$ or $a' \in X'_{m+1}$. We consider the former case (the analysis of the latter case is symmetric). By Condition 3, $a \in Y_m \setminus \tilde Y_m$ and the relation $a \equiv_{\phi_m} x$ holds for the only $x = \phi_{m+1}(a)$. We have $a' \ne \phi_i(a) = \phi_{m+1}(a)$ (the latter equality is due to the uniqueness of the $\phi_i$'s ensured by Condition 1). Therefore $a \not\equiv_{\phi_m} a'$, which means that (a, a') is m-threatening. □

Claim B. Assume that Phase $j$, $j \le k + 1$, finishes, Conditions 1-3 for all $i \le j$
are met, and the game is going on. Let $1 \le i \le j$. As soon as an
$i$-threatening pair $(a, a')$ arises after Phase $j$, Spoiler is able to win in at most $(i-1)(k-1)$ moves,
playing all the time, at his own choice, either in $M$ or in $M'$.

Convention. Given a relation $R = R^M$ of $M$, we will denote the respective relation
$R^{M'}$ by $R'$.

Proof of Claim. We proceed by induction on $i$. For $i = 1$ the claim easily follows
from Item 3 of Lemma 3.7. Let $i \ge 2$ and assume that the claim is true for all
preceding values $1, 2, \ldots, i-1$.

We focus on the case that $a \notin X_i$ (in the case that $a' \notin X'_i$ the proof is given by
the symmetric argument). The non-equivalence $a \not\equiv_{\phi_i} a'$ can happen in two situations.

Case 1: $a' \in X'_i$. Clearly, $a \ne \phi_i^{-1}(a')$ and therefore, by Claim A, the pair $(a, a')$
is $m$-threatening for some $m < i$. By the induction hypothesis, Spoiler is able to
win in at most $(m-1)(k-1)$ moves.

Case 2: a' (jz X[. Then the non-equivalence a ^± a 1 means that there is an Z-ary 
relation of M and a G (Xj U {a}) 1 with at least one occurrence of a such that 

Ra ^ R'-^a, (23) 

where ip is the map defined by i])(x) = (f>i(x) for all x in A = {ai, . . . ,a;} \ {a} 
and by ip(a) = a'. Thus, ip is not a partial isomorphism from M to M' . Hence, if 
A C dom0*, then Spoiler wins immediately. 

Assume that A — A \ dom <fi* is nonempty. Spoiler selects all unselected elements 
in A, if he wants to play in M, or in <pi(A), if he prefers to play in M' . This takes 
at most k — 1 moves. Suppose that Spoiler plays in M (for M' the argument is 
symmetric). If for every b G A its counterpart in V(M') is (fii(b), this is Spoiler's win 
by (23). If some b G A has the counterpart b' such that b' ^ 4>i(b), by Claim A there 
arises an m-threatening pair for some m < i. Applying the induction hypothesis 
for the index m, we conclude that Spoiler is able to win in at most (m — l)(k — 1) 
moves, having made altogether at most (k — 1) + (m — l)(k — 1) < (i — l)(k — 1) 
moves. □ 

Claim C. Assume that Phase $j$, $j \le k$, has been finished and Conditions 1-3 for
all $i \le j$ are met. Assume furthermore that Part 1 of Phase $j + 1$ finishes and the
game is still going on. Then either Conditions 1-3 hold true for $i = j + 1$ as well or
Spoiler is able to win or to create an $i$-threatening pair for some $i \le j$ in at most $k$
moves with at most one alternation from $M$ to $M'$ (and hence he is able to win in
Part 2 of Phase $j + 1$).

Proof of Claim. Assuming that Spoiler is unable to win or to create an i-threatening 
pair, we check Conditions 1-3. 

Condition 2. The following two facts take place, for else Spoiler would be able 
to enforce creating a j-threatening pair in at most one move: 

• For every C E C k+1 (X'j) there is C E C k+1 (Xj) such that C = . C. (Oth- 
erwise Spoiler selects an elements in C that violates this condition and a 
j-threatening pair arises whatever Duplicator's response.) 

• For every C E C k+1 (Xj) there is C E C k+1 (X'j) such that C =+. C and 
\C\ > |C| — 1. (Otherwise, for some C, 4>j + i(C) cannot be included into the 
respective C . Therefore (fi*j +1 (c) for at least one c E C, providing us with 
a j-threatening pair.) 

Thus, there is a one-to-one correspondence between C k+1 (Xj) and C k+1 (Xj) such 
that, for C and C corresponding to one another, C =^ C, \C'\ > \C\ — 1, and 
4>j+i(C) C C. Moreover, it actually holds \C'\ = \C\ because, if \C'\ > \C\ + 1, 
Spoiler could select 2 vertices in C'\4>* +1 (C) obtaining a j-threatening pair whatever 
Duplicator's response. 

Conditions 1 and 3. By Condition 1 for % — j, the partial isomorphism can 
be extended on Xj only to <f)j and then it remains undefined within Xj + i only on 
Yj \Yj. Define an extension 4>j+\ of on the whole Xj + i so that <pj + i 

• agrees with (fij on Xj, 

• agrees with <p* +1 on Yj, and 

• for each C E C k+1 (Xj), takes the single element in C \ C to the single element 
in C \ 0* +1 (C), where C corresponds to C according to Condition 2 that we 
have already proved. 

We have to show that <f>j+i is a partial isomorphism from M to M' and no other 
extension of is such. 

Assume that 4>j+i is n °t a partial isomorphism and get a contradiction to the 
assumption that Spoiler can in the nearest k moves neither win nor create an i- 
threatening pair. For some /-ary relation of M and a E X l j+1 , we should have 

Ra ^ R'<f) j+l a. (24) 

As a consequence, A = {a±, . . . , a{\ is not included into dom0* +1 for else would 
not be a partial isomorphism, contradicting the assumption that the game is still 
going on. Let Spoiler select all elements in A = A\dom0* +1 . If for b E A Duplicator 
always responds with <pj + i(b), he loses by (24). Otherwise, let b be an element in 
A to which Duplicator responds with $b' \ne \phi_{j+1}(b)$. If $b \in X_j$ or $b' \in X'_j$, then we
have $b' \ne \phi_j(b)$ because $\phi_{j+1}$ extends $\phi_j$. By Claim A, $(b, b')$ is an $i$-threatening
pair for some $i < j$. If $b \in X_{j+1} \setminus X_j$ and $b' \in X'_{j+1} \setminus X'_j$, then $b \not\equiv_{\phi_j} b'$ by Condition
2 proved above and the definition of $\phi_{j+1}$. Thus, $(b, b')$ is $j$-threatening. We have a
contradiction in any case and therefore $\phi_{j+1}$ is indeed a partial isomorphism from $M$ to $M'$.

To prove the uniqueness of the extension <pj+i (i.e., Condition 1), assume that 
4>j+i is another extension of over Xj + i which is a partial isomorphism and 
differs from <f>j +1 at b G Yj \ Yj. Let b' = and 6" = <pj + i{b). By Condition 2 

proved above, 

6' £ x , b". (25) 

By Condition 1 for i — j, <j>j+i on Xj coincides with <pj. Thus, the composition 
takes 6' to 6", extends idxj, and is an automorphism of M'[X' j+1 ]. This 
makes a contradiction to (25). □ 

Claim C implies, by an easy induction on $j$ from 1 to $k+1$, that for each $1 \le j \le k$,
unless Spoiler wins in Phase $j$ or earlier, Conditions 1-3 assumed in our description
of Phase $j + 1$ are indeed true. For the analysis of the concluding phase, we state simple
consequences of Claims A-C.

Claim D. Suppose that Spoiler follows the strategy designed above (Duplicator's 
strategy does not matter). Assume that Duplicator survives up to Phase k + 1. 
Then the following claims are true. 

1) Conditions 1-3 hold true for all $i \le k + 1$.

2) When in further play Spoiler selects $v \in V(M) \cup V(M')$, we denote Duplicator's
response by $\psi(v)$. As long as there arises no $i$-threatening pair for any $i \le k$,
it holds

$$\psi(v) \equiv_{\phi_k} v \quad \text{if } v \notin X_k \cup X'_k, \tag{26}$$
$$\psi(v) = \phi_{k+1}(v) \quad \text{if } v \in X_{k+1}, \tag{27}$$
$$\psi(v) = \phi_{k+1}^{-1}(v) \quad \text{if } v \in X'_{k+1}. \tag{28}$$

(The relations in (26) and (27)-(28) are equivalent on $(X_{k+1} \cup X'_{k+1}) \setminus (X_k \cup X'_k)$.)

Proof of Claim. Item 1 follows from Claim C by an easy inductive argument.
Regarding Item 2, note that, if (26) were false, $(v, \psi(v))$ would be a $k$-threatening
pair. If (27) or (28) were false, $(v, \psi(v))$ would be an $i$-threatening pair for some
$i \le k$ on account of Claim A. □

Concluding Phase (Phase k + 2). 

We assume here that Phases from 1 up to $k + 1$ have been finished without
Spoiler's win and therefore Items 1 and 2 of Claim D hold true. As soon as there
arises an $i$-threatening pair for some $i \le k$, Spoiler switches to the strategy given by
Claim B and wins in at most $(k-1)^2$ moves. As long as no such pair occurs,
Spoiler follows the strategy described below. The strategy depends on which of the
following three cases takes place.
Case 1: There is a one-to-one correspondence between $\mathcal{C}(X_k)$ and $\mathcal{C}(X'_k)$ such
that, if $C$ and $C'$ correspond to one another, then $C \equiv_{\phi_k} C'$ and, moreover, $|C| = |C'|$.
By Item 1 of Claim D, such a correspondence does exist between $\mathcal{C}^{k+1}(X_k)$ and
$\mathcal{C}^{k+1}(X'_k)$ in any case.

Let $T$ be the set of maps $\phi : V(M') \to V(M)$ such that

• $\phi$ is one-to-one,

• $\phi$ extends $\phi_{k+1}^{-1}$,

• for every $C' \in \mathcal{C}(X'_k)$, we have $\phi(C') \in \mathcal{C}(X_k)$ and $\phi(C') \equiv_{\phi_k} C'$.

Claim E. Assume that $\phi$ and $\psi$ are in $T$. Let $R$ be an $l$-ary relation of $M$. Then
$R\phi\bar{a}' = R\psi\bar{a}'$ for all $\bar{a}' \in V(M')^l$.

Proof of Claim. The product $\psi\phi^{-1}$ is a permutation of $V(M)$ that moves only
elements in $Z$. Moreover, $\psi\phi^{-1}$ preserves the partition $\mathcal{C}(X_k) \setminus \mathcal{C}^{k+1}(X_k)$ of $Z$
and therefore $\psi\phi^{-1}$ is decomposed into the product of permutations $\pi_C$ over
$C \in \mathcal{C}(X_k) \setminus \mathcal{C}^{k+1}(X_k)$, where each $\pi_C$ acts on the respective $C$. Since every $\pi_C$ is
decomposable into a product of transpositions, we have $\psi\phi^{-1} = \tau_1\tau_2 \cdots \tau_t$ with
$\tau_i$ being a transposition of two elements both in some $C$. It is easy to see that
$\psi\bar{a}' = (\cdots((\phi\bar{a}')^{\tau_t}) \cdots)^{\tau_1}$. By Lemma 3.12, each application of $\tau_i$ does not change
the initial value of $R\phi\bar{a}'$. Thus we arrive at the desired equality $R\phi\bar{a}' = R\psi\bar{a}'$.
□
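The decomposition step used in the proof of Claim E can be illustrated concretely. The following Python sketch is only an illustration: the helper names `transpositions` and `apply_swaps` and the dictionary encoding of a permutation are assumptions made here, not notation from the paper. It splits a permutation into cycles and writes each cycle as a sequence of transpositions; every transposition exchanges two elements of one cycle, hence of one class whenever the permutation preserves the classes.

```python
def transpositions(perm):
    """Decompose a permutation, given as a dict x -> perm[x], into swaps.

    The returned list [(a1, b1), ..., (at, bt)] reproduces perm when the
    swaps are applied from left to right (see apply_swaps).
    """
    swaps, seen = [], set()
    for start in perm:
        if start in seen:
            continue
        # Collect the cycle start -> perm[start] -> ... -> start.
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        # A cycle (c0 c1 ... cr) is realized by the swaps (c0 c1), ..., (c0 cr).
        swaps.extend((cycle[0], y) for y in cycle[1:])
    return swaps


def apply_swaps(swaps, x):
    """Apply the swaps to the element x one after another, left to right."""
    for a, b in swaps:
        x = b if x == a else a if x == b else x
    return x
```

For instance, for the permutation with cycles (0 1 2) and (3 4), every swap produced stays inside one of the two cycles.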

To specify Spoiler's strategy, we fix $\phi \in T$ arbitrarily. Since $M$ and $M'$ are
nonisomorphic, $\phi$ is not an isomorphism from $M'$ to $M$, that is,

$$R\phi\bar{a}' \ne R'\bar{a}' \tag{29}$$

for some Z-ary relation R' of M' and a' G V(M') 1 . This inequality implies that the 
set A' = {a[, . . . , a'j} is not included into X' k+l . Spoiler selects, one by one, elements 
of A = A' \ range <fik+i- ■^' or Spoiler's move v, let ip{v) denote Duplicator's response. 
Assume first that 

ipiv) =^ k v whenever v ^ X' k 

and (30) 
$(v) = 0fe+i(v) whenever v G X k+1 . 

Due to (30), we are able to extend $\psi$, initially defined on A, to a map in $T$. Fix
such an extension. By Claim E, $R\phi\bar{a}' = R\psi\bar{a}'$ and, by (29), Spoiler wins. If (30)
is violated for some $v \in$ A, by Item 2 of Claim D this produces an $i$-threatening
pair for $i \le k$ and therefore Spoiler wins in at most $(k-1)^2$ moves, having made
altogether at most $k + (k-1)^2$ moves.

Case 2: There is no one-to-one correspondence between $\mathcal{C}(X_k)$ and $\mathcal{C}(X'_k)$ such
that, if $C$ and $C'$ correspond to one another, then $C \equiv_{\phi_k} C'$. Spoiler selects an
element in a class $C$ or $C'$ that has no counterpart. If Duplicator responds with a vertex
outside $X_k \cup X'_k$, there arises a $k$-threatening pair. If Duplicator responds with a
vertex in $X_k \cup X'_k$, there arises an $i$-threatening pair for $i \le k$ by Item 2 of Claim
D. This allows Spoiler to win altogether in at most $1 + (k-1)^2$ moves.

Case 3: There is a one-to-one correspondence between $\mathcal{C}(X_k)$ and $\mathcal{C}(X'_k)$ such
that, if $C$ and $C'$ correspond to one another, then $C \equiv_{\phi_k} C'$. However, there are
$C \in \mathcal{C}(X_k)$ and $C' \in \mathcal{C}(X'_k)$ such that $C \equiv_{\phi_k} C'$ but $|C| \ne |C'|$.

Call a class $C \in \mathcal{C}(X_k)$ useful if $C \equiv_{\phi_k} C'$ but $|C| \ne |C'|$. The description of Case
3 tells us that there is at least one useful class. Actually, since $|V(M)| = |V(M')|$,
there are at least two useful classes, $C_1$ and $C_2$. Note that $|C_1| + |C_2| \le |Z|$. Without
loss of generality, assume that $|C_1| \le |Z|/2$. Let $C'_1$ be the counterpart of $C_1$ in
$\mathcal{C}(X'_k)$, i.e., $C_1 \equiv_{\phi_k} C'_1$. In the larger of $C_1$ and $C'_1$ Spoiler selects $\min\{|C_1|, |C'_1|\} + 1$
elements. Duplicator is forced to reply at least once outside the smaller class. By
Item 2 of Claim D, this produces an $i$-threatening pair and Spoiler, according to
Claim B, wins in at most $(k-1)^2$ subsequent moves, having made altogether at
most $|Z|/2 + 1 + (k-1)^2$ moves.

End of description of the concluding phase 

4.2 Estimation of the length of the game 

If Spoiler follows the above strategy and Duplicator delays his loss as long as possible,
the end of the game is always this: Spoiler enforces creating a threatening pair in
at most $k$ moves and then wins in at most $(k-1)^2$ next moves using the strategy
of Claim B. Let us calculate the smallest possible (optimal for Duplicator) number
of elements in $M$ unoccupied till this final stage of the game. The minimum is
attained if all Phases from 1 up to $k + 2$ are played and Case 3 occurs in
Phase $k + 2$. Then the number of elements unoccupied in $X_{k+1}$ is equal to

i:\Yt\n =j:\c k+i (x t )\. 
i=i i=i 

The number of elements unoccupied in Z is at least \Z\ — (\Z\/2 + 1) = \Z\/2 — 1. 
By Lemma 3.13, the total number of unoccupied elements is at least 

tr i(x ^- i> T k -\-h 

Thus, the maximum possible number of occupied elements is less than 

( l -2^) n+l 2 + Tk- 
Summing up, we conclude that our strategy allows Spoiler to win in fewer than
$$\left(1 - \frac{1}{2k}\right) n + k^2 - k + 2$$
moves. Theorem 4.1 is proved.
4.3 Definability results



A natural question is if our approach applies to defining rather than identifying 
formulas. In fact, the proof of Theorem 4.1 implies the definability with lower 
quantifier rank for a quite representative class of structures. 

4.3.1 Definability of irredundant structures 

Definition 4.2 If $M$ is a finite structure, let

$$\sigma(M) = \max\{\, |A| : A \subseteq V(M) \text{ such that } a_1 \sim a_2 \text{ for every } a_1, a_2 \in A \,\}$$

be the maximum cardinality of a $\sim$-equivalence class in $V(M)$.

If $\sigma(M) = 1$, i.e., no transposition of two elements is an automorphism of $M$, we
call $M$ irredundant.
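The definition suggests a direct way to compute the invariant on small examples. The sketch below is an illustration under assumptions made here: a structure is encoded as a list of elements plus a dictionary mapping each relation name to a set of tuples, and the names `swap`, `is_transposition_automorphism` and `sigma` are hypothetical. It groups elements into classes of pairwise interchangeable elements, relying on the fact that interchangeability is an equivalence relation, and returns the largest class size.

```python
def swap(t, u, v):
    """Apply the transposition (u v) to every coordinate of the tuple t."""
    return tuple(v if x == u else u if x == v else x for x in t)


def is_transposition_automorphism(relations, u, v):
    """True if exchanging u and v maps every relation onto itself."""
    return all({swap(t, u, v) for t in tuples} == tuples
               for tuples in relations.values())


def sigma(universe, relations):
    """Largest size of a class of pairwise interchangeable elements.

    universe is a non-empty list; relations maps names to sets of tuples.
    """
    classes = []
    for x in universe:
        for cls in classes:
            # One representative suffices because the relation is transitive.
            if is_transposition_automorphism(relations, cls[0], x):
                cls.append(x)
                break
        else:
            classes.append([x])
    return max(len(c) for c in classes)
```

With this encoding a structure is irredundant exactly when `sigma` returns 1; for the undirected path on three vertices, with each edge stored in both directions, the two endpoints are interchangeable and the value is 2.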

Theorem 4.3 Let M be an irredundant structure of order n with maximum relation 
arity k. Then 



Proof. It is not hard to see that an irredundant structure all of whose relations
are unary is definable by a formula with quantifier rank 1. Assume therefore that
$k \ge 2$. Notice that Spoiler's strategy described in Section 4.1 applies to any pair
of $L$-structures $M$ and $M'$ of arbitrary orders, with the only exception of Case 3
in the concluding Phase $k + 2$, where the equality $|V(M)| = |V(M')|$ is supposed.
Since the set $Z$ is partitioned into $\sim$-equivalence classes each consisting of at least
$k + 2$ elements, for an irredundant structure $M$ we have $Z = \emptyset$. Consequently,
$V(M) = X_{k+1}$. It follows that either Spoiler wins at the latest in Phase $k + 1$ or,
according to Item 1 of Claim D, there is a partial isomorphism $\phi_{k+1}$ from $M$ to $M'$
with $\mathrm{dom}\,\phi_{k+1} = V(M)$.

In the latter case, since $M$ and $M'$ are non-isomorphic, there is at least one
element $v \in V(M') \setminus \mathrm{range}\,\phi_{k+1}$. In the concluding phase of the game Spoiler selects
$v$ and, according to Item 2 of Claim D, there arises a $k$-threatening pair. Spoiler
switches to the strategy given by Claim B and wins in at most $(k-1)^2$ moves.

It remains to estimate the length of the game. Similarly to Section 4.2, we
conclude that Spoiler needs at most $n - \sum_{i=1}^{k} |\mathcal{C}^{k+1}(X_i)| + k + (k-1)^2$ moves to win. By
estimate (22), where $|Z| = 0$, this number is less than $(1 - \frac{1}{2k})n + k^2 - k + 1$. ■

Remark 4.4 There are simple examples of irredundant structures $M$ showing a
lower bound $D(M) \ge n/4$. For example, let $F$ be a directed graph on two vertices
$u$ and $v$ consisting of a single (directed) edge $(uv)$. Let $G$ be another directed
graph on $u$ and $v$ consisting of two edges, $(uv)$ and the loop $(uu)$. Denote the
disjoint union of $a$ copies of $F$ and $b$ copies of $G$ by $aF + bG$. It is easy to see
that $aF + bG$ is irredundant for any $a$ and $b$. Directed graphs $M = mF + mG$
and $M' = (m-1)F + (m+1)G$ are non-isomorphic and both have order $4m$. An
obvious strategy for Duplicator in the Ehrenfeucht game on $M$ and $M'$ shows that

$$D(M, M') \ge m.$$
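The graphs of this example are small enough to be checked mechanically. The following sketch builds $aF + bG$ and tests irredundance by brute force over all pairs of vertices; the vertex numbering (copy $c$ uses $2c$ as $u$ and $2c+1$ as $v$) and the function names are assumptions made for this illustration only.

```python
from itertools import combinations


def disjoint_union(a, b):
    """Vertices and edge set of aF + bG; the first a copies are copies of F."""
    edges = set()
    for c in range(a + b):
        u, v = 2 * c, 2 * c + 1
        edges.add((u, v))        # the edge (uv), present in both F and G
        if c >= a:
            edges.add((u, u))    # the loop (uu), present only in G
    return list(range(2 * (a + b))), edges


def swap(t, x, y):
    """Apply the transposition (x y) to every coordinate of the tuple t."""
    return tuple(y if z == x else x if z == y else z for z in t)


def is_irredundant(vertices, edges):
    """True if no transposition of two vertices preserves the edge set."""
    return not any({swap(e, x, y) for e in edges} == edges
                   for x, y in combinations(vertices, 2))
```

For $m = 3$ the two graphs $3F + 3G$ and $2F + 4G$ both have 12 vertices but 9 and 10 edges respectively, so they cannot be isomorphic.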
Theorem 4.3 will be considerably strengthened in the next subsections. In
particular, it will be surpassed by Theorem 4.11.

4.3.2 A further refinement 

As we observed in the proof of Theorem 4.3, Spoiler's strategy designed in Section
4.1 ensures the bound

$$D_1(M, M') < \left(1 - \frac{1}{2k}\right) n + k^2 - k + 2 \tag{31}$$

for $M'$ of any order under an additional condition imposed on $M$. We are able to
describe much more precisely the exceptional pairs of non-isomorphic $M$ and $M'$ for
which (31) may fail. Assume that $M'$ has order $n' \ge n$. As was already mentioned,
the assumption that $n' = n$ is used only in Case 3 of the concluding Phase $k + 2$.
Turning back to this case, we see that what is actually used is the existence of at
least two useful classes in $\mathcal{C}(X_k)$. Thus, (31) may fail only in the case that
there is a unique useful class $C_0 \in \mathcal{C}(X_k)$. Since actually $C_0 \in \mathcal{C}(X_k) \setminus \mathcal{C}^{k+1}(X_k)$, we
have $|C_0| \ge k + 2$. By Lemma 3.12, the class $C_0$ consists of pairwise $\sim$-equivalent
elements.

Let $C'_0$ be the counterpart of $C_0$ in $\mathcal{C}(X'_k)$, i.e., $C'_0 \equiv_{\phi_k} C_0$. Given $B \subseteq C'_0$
with $|B| = |C_0|$, let $M'_B = M'[V(M') \setminus (C'_0 \setminus B)]$. Consider an arbitrary map
$\phi : V(M'_B) \to V(M)$ extending $\phi_{k+1}^{-1}$, mapping each $C' \in \mathcal{C}(X'_k) \setminus \{C'_0\}$ onto its
$\equiv_{\phi_k}$-counterpart in $\mathcal{C}(X_k)$, and mapping $B$ onto $C_0$. As in Case 1 of Phase $k + 2$, we
see that Spoiler is able to win within the bound of (31) unless $\phi$ is an isomorphism
from $M'_B$ to $M$. From here we easily arrive at the following conclusion.

Lemma 4.5 Let $L$ be a vocabulary with maximum relation arity $k$. Let $M$ and $M'$
be non-isomorphic $L$-structures of orders $n$ and $n'$ respectively and $n \le n'$. Then
the bound

$$D_1(M, M') < \left(1 - \frac{1}{2k}\right) n + k^2 - k + 2$$

may be false only if there is a set $C_0 \subseteq V(M)$ with $|C_0| \ge k + 2$ consisting of pairwise
$\sim$-equivalent vertices and there is a partial isomorphism $\psi$ from $M$ to $M'$ defined on
$V(M) \setminus C_0$ every injective extension of which is a partial isomorphism from $M$ to $M'$.

In the next subsection we make a constructive interpretation of the condition 
appearing in the lemma. 

4.3.3 Cloning an element of a structure 

Notation. Recall that, given a set $V$ and a function $\pi$ defined on $V$, we extend it
over $V^l$, where $l \ge 1$, by $\pi\bar{u} = (\pi(u_1), \ldots, \pi(u_l))$ for any $\bar{u} = (u_1, \ldots, u_l)$ with all $u_i$
in $V$. In particular, this concerns the case that $\pi$ is a permutation of elements of $V$.
Recall also that, if $\pi = (v_1 v_2)$ is a transposition, then we may write $\bar{u}^{(v_1 v_2)}$ in place
of $\pi\bar{u}$.
Definition 4.6 Given $v \in V(M)$, let $[v]_M = \{u \in V(M) : u \sim v\}$ be the $\sim$-equivalence
class of the element $v$.

We now introduce an operation of expanding a class $[v]_M$, i.e., adding to $M$ new
elements $\sim$-equivalent to $v$. This operation was considered in [14] in the particular
case of uniform hypergraphs.

Let $L$ be a vocabulary with maximum relation arity $k$. Below $K$ and $M$ are
$L$-structures, $v$ is an element of $M$, and $t$ is a non-negative integer.

Definition A The notation $K = M \oplus tv$ means that the following conditions are
fulfilled.

A1 $V(M) \subseteq V(K)$ and $|V(K)| = |V(M)| + t$.
A2 $K[V(M)] = M$.
A3 $|[v]_M| \ge k$.
A4 $[v]_K = [v]_M \cup (V(K) \setminus V(M))$.

Definition B The notation $K = M \oplus tv$ means that the following conditions are
fulfilled.

B1 $V(M) \subseteq V(K)$ and $|V(K)| = |V(M)| + t$.

B2 There is $C \subseteq [v]_M$ with $|C| \ge k$ such that every injective extension of $\mathrm{id}_{V(M) \setminus C}$
to a map $\psi : V(M) \to V(K)$ is a partial isomorphism from $M$ to $K$.

Definition C The notation $K = M \oplus tv$ means that the following conditions are
fulfilled.

C1 $V(M) \subseteq V(K)$ and $|V(K)| = |V(M)| + t$.

C2 $|[v]_M| \ge k$.

C3 Let $R$ be an $l$-ary relation in $L$. If $\bar{u} \in V(M)^l$, then $R^K\bar{u} = R^M\bar{u}$.

C4 Let $R$ be an $l$-ary relation in $L$. Assume that $\bar{u} \in V(K)^l$ and the set $\{u_1, \ldots, u_l\} \setminus
V(M) = \{w_1, \ldots, w_p\}$ is nonempty. Then $R^K\bar{u} = 1$ iff there are pairwise
distinct elements $v_1, \ldots, v_p \in [v]_M \setminus \{u_1, \ldots, u_l\}$ such that $R^M\pi\bar{u} = 1$ for
$\pi = (w_1 v_1) \cdots (w_p v_p)$.

Lemma 4.7 Definitions A, B, and C are equivalent. 

Proof. Conditions A1-A4 imply Conditions B1-B2. Since B1 coincides with
A1, we only have to derive B2. We are actually able to prove B2 for an arbitrary
$C \subseteq [v]_M$ with $|C| \ge k$ (there is at least one such $C$ by A3). Let $\psi$ be as specified
in B2. For any $l$-ary relation $R$ in $L$ and $\bar{u} \in V(M)^l$, we have to check that $R^M\bar{u} =
R^K\psi\bar{u}$. Assume that in $\{\psi(u_1), \ldots, \psi(u_l)\}$ there are $p$ elements from $V(K) \setminus V(M)$
and denote them by $w_1, \ldots, w_p$. Take arbitrary pairwise distinct $v_1, \ldots, v_p \in C \setminus
\{\psi(u_1), \ldots, \psi(u_l)\}$. Let $\tilde{u} = \pi\psi\bar{u}$ with $\pi = (w_1 v_1) \cdots (w_p v_p)$. By A4, we have
$v_i \sim w_i$ in $K$ for all $i \le p$. It follows that $R^K\psi\bar{u} = R^K\tilde{u}$. Since $\tilde{u} \in V(M)^l$, by A2
we have $R^K\tilde{u} = R^M\tilde{u}$. Notice now that $\tilde{u}$ and $\bar{u}$ coincide at the positions occupied
by elements in $V(M) \setminus C$, while elements in $C$ are permuted according to some
permutation $\tau$, i.e., $\tilde{u} = \tau\bar{u}$. Since $\tau$ is decomposable into a product of transpositions
and elements of $C$ are pairwise $\sim$-equivalent in $M$, we have $R^M\tilde{u} = R^M\bar{u}$, completing
the derivation of B2.

Conditions B1-B2 imply Conditions C1-C4. For C1 and C2 this is trivial. C3
immediately follows from B2 if we take $\psi = \mathrm{id}_{V(M)}$. Let us focus on C4. Let $\bar{u}$
and $w_1, \ldots, w_p$ be as specified in this condition. Assume first that $R^K\bar{u} = 1$. Take
$v_1, \ldots, v_p \in C \setminus \{u_1, \ldots, u_l\}$ being pairwise distinct and define $\psi$ by $\psi(v_i) = w_i$
for $i \le p$ and $\psi(x) = x$ for all other $x \in V(M)$. Notice that $\psi^{-1}\bar{u} = \pi\bar{u}$ for
$\pi = (w_1 v_1) \cdots (w_p v_p)$. As $\psi$ is a partial isomorphism by B2, we conclude that
$R^M\pi\bar{u} = R^K\bar{u} = 1$. This proves C4 in one direction. Such a way of proving
$R^M\pi\bar{u} = R^K\bar{u}$ will be referred to as the $\psi$-argument.

For the other direction, assume that $R^M\pi\bar{u} = 1$ for $\pi = (w_1 v_1) \cdots (w_p v_p)$ with
some $v_1, \ldots, v_p \in [v]_M \setminus \{u_1, \ldots, u_l\}$. If all $v_i$ are in $C$, the equality $R^K\bar{u} = 1$ follows
from the $\psi$-argument with the same $\psi$ as above. Otherwise, we can replace each $v_i$
with some $v'_i \in C$, where $v'_1, \ldots, v'_p$ are pairwise distinct elements of $C \setminus \{u_1, \ldots, u_l\}$
and $v'_i = v_i$ whenever $v_i \in C$. For no $i$ does this replacement change the initial value
of $R^M\pi\bar{u}$ and, after all replacements are done, we have $R^M\pi'\bar{u} = 1$ with $\pi' =
(w_1 v'_1) \cdots (w_p v'_p)$. Defining $\psi'$ by $\psi'(v'_i) = w_i$ and $\psi'(x) = x$ elsewhere on $V(M)$, we
obtain $R^K\bar{u} = R^M\pi'\bar{u} = 1$ by the $\psi'$-argument.

Conditions C1-C4 imply Conditions A1-A4. Since A1-A3 are virtually the same
as C1-C3, our concern is A4. It is easy to see that $[v]_K \cap V(M)$ cannot be larger
than $[v]_M$. Therefore, it suffices to show that in $K$ we have $v \sim v'$ for any $v' \in
[v]_M \cup (V(K) \setminus V(M))$. Given an $l$-ary relation $R$ in $L$ and $\bar{u} \in V(K)^l$, we have to
check that

$$R^K\bar{u} = R^K\bar{u}^{(vv')}.$$

We do it by routine examination of several cases. Note that, if neither $v$ nor $v'$
occurs in $\bar{u}$, then there is nothing to prove.
To simplify notation, denote

$$\tilde{u} = \bar{u}^{(vv')}.$$

Furthermore, let $U = \{u_1, \ldots, u_l\}$ and $U \setminus V(M) = \{w_1, \ldots, w_p\}$. Denote the set of
elements in $\tilde{u}$ by $\tilde{U}$.

Case 1: $v' \in V(K) \setminus V(M)$.

Subcase 1.1: $v \in U$, $v' \in U$.
Assuming $R^K\bar{u} = 1$, we will infer $R^K\tilde{u} = 1$. This will give also the converse
implication because $\bar{u}$ is supposed arbitrary with occurrences of both $v$ and $v'$ and we hence
can take $\tilde{u}$ instead of $\bar{u}$. Without loss of generality, assume that $v' = w_p$. By C4,
there are $v_1, \ldots, v_p \in [v]_M \setminus U$ such that $R^M\pi\bar{u} = 1$ with $\pi = (w_1 v_1) \cdots (w_{p-1} v_{p-1})(v' v_p)$.
As easily seen, $\pi\tilde{u} = (\pi\bar{u})^{(v v_p)}$. Since $v_p \sim v$ in $M$, we have $R^M\pi\tilde{u} = 1$. Note that
$\tilde{U} = U$ and hence $v_1, \ldots, v_p \in [v]_M \setminus \tilde{U}$. By C4, we conclude that $R^K\tilde{u} = 1$, as
desired.

Subcase 1.2: $v \in U$, $v' \notin U$.
Note that $\tilde{U} \setminus V(M) = \{w_1, \ldots, w_p, v'\}$ and $[v]_M \setminus \tilde{U} = ([v]_M \setminus U) \cup \{v\}$. We first
assume that $R^K\bar{u} = 1$ and infer from here that $R^K\tilde{u} = 1$. Let $v_1, \ldots, v_p$ be as
ensured by Condition C4 for $\bar{u}$, that is, $R^M\pi\bar{u} = 1$ with $\pi = (w_1 v_1) \cdots (w_p v_p)$. Let
$\pi' = \pi(vv')$. As easily seen, $\pi'\tilde{u} = \pi\bar{u}$. Thus, $R^M\pi'\tilde{u} = R^M\pi\bar{u} = 1$ and, by C4, we
conclude that $R^K\tilde{u} = 1$.

We now assume that $R^K\tilde{u} = 1$ and have to infer $R^K\bar{u} = 1$. According to C4,
there are pairwise distinct $v'_1, \ldots, v'_{p+1} \in [v]_M \setminus \tilde{U}$ such that $R^M\pi'\tilde{u} = 1$ with $\pi' =
(w_1 v'_1) \cdots (w_p v'_p)(v' v'_{p+1})$. Choose pairwise distinct $v_1, \ldots, v_p$ in $\{v'_1, \ldots, v'_{p+1}\} \setminus \{v\}$
and apply to $\bar{u}$ the substitution $\pi = (w_1 v_1) \cdots (w_p v_p)$. It is not hard to see that
$\pi\bar{u} = \tau\pi'\tilde{u}$ for $\tau$ being a permutation of the set $V = \{v, v'_1, \ldots, v'_{p+1}\}$ taking $v'_i$ to
$v_i$ for $i \le p$ and $v'_{p+1}$ to $v$. Such a $\tau$ exists because elements in $\{v'_1, \ldots, v'_{p+1}\}$ and in
$\{v_1, \ldots, v_p, v\}$ are pairwise distinct (the fact that the two sets may intersect does not
matter). Since $\tau$ is decomposable into a product of transpositions of two elements from
$V$ and elements in $V$ are pairwise $\sim$-equivalent in $M$, we have $R^M\pi\bar{u} = R^M\pi'\tilde{u} = 1$.
By C4, we conclude that $R^K\bar{u} = 1$, as desired.

Subcase 1.3: $v \notin U$, $v' \in U$.
This subcase reduces to Subcase 1.2 by considering $\tilde{u}$ in place of $\bar{u}$.

Case 2: $v' \in [v]_M$.

Since in this case $v$ and $v'$ are interchangeable, it suffices to assume that $v \in U$ and
prove that $R^K\bar{u} = 1$ implies $R^K\tilde{u} = 1$. Note that $\tilde{U} \setminus V(M) = \{w_1, \ldots, w_p\}$.

Subcase 2.1: $v' \in U$.
Note that $[v]_M \setminus \tilde{U} = [v]_M \setminus U$. Let $v_1, \ldots, v_p \in [v]_M \setminus U$ be as ensured by Condition
C4 for $\bar{u}$, i.e., $R^M\pi\bar{u} = 1$ with $\pi = (w_1 v_1) \cdots (w_p v_p)$. Applying the same $\pi$ to $\tilde{u}$, we
see that $\pi\tilde{u} = (\pi\bar{u})^{(vv')}$. As $v \sim v'$ in $M$, we have $R^M\pi\tilde{u} = R^M\pi\bar{u} = 1$ and hence,
by C4, we obtain $R^K\tilde{u} = 1$.

Subcase 2.2: $v' \notin U$.
Note that $[v]_M \setminus \tilde{U} = (([v]_M \setminus U) \setminus \{v'\}) \cup \{v\}$. Let $v_1, \ldots, v_p$ and $\pi$ be as in Subcase
2.1. The difference is that now the containment $v' \in \{v_1, \ldots, v_p\}$ is possible. For
$i \le p$, set

$$v'_i = \begin{cases} v_i & \text{if } v_i \ne v', \\ v & \text{if } v_i = v' \end{cases}$$

and apply to $\tilde{u}$ the substitution $\pi' = (w_1 v'_1) \cdots (w_p v'_p)$. It is not hard to see that
$\pi'\tilde{u} = \tau\pi\bar{u}$ for $\tau$ being a permutation of the set $\{v, v_1, \ldots, v_p, v'\}$ taking $v_i$ to $v'_i$ for
all $i \le p$ and $v$ to $v'$. Similarly to the second part of Subcase 1.2, we conclude that
$R^M\pi'\tilde{u} = R^M\pi\bar{u} = 1$ and, by C4, we obtain $R^K\tilde{u} = 1$. ■

Lemma 4.8 Let $L$ be a vocabulary with maximum relation arity $k$. Let $M$ be an
$L$-structure, $v \in V(M)$ with $|[v]_M| \ge k$, and $t \ge 0$. Then an $L$-structure $K$ such
that $K = M \oplus tv$ exists and is unique up to an isomorphism.

Proof. The existence follows from Definition C. To obtain $K$, we add $t$ new
elements to $V(M)$, keep all relations of $M$ on $V(M)$, and add new relations involving
at least one new element, being guided by Condition C4.
To prove the uniqueness, we use Definition B. Assume that $K_1 = M \oplus tv$ and
$K_2 = M \oplus tv$ according to this definition. Let $\phi : V(K_1) \to V(K_2)$ be an arbitrary
one-to-one map whose restriction to $V(M)$ is $\mathrm{id}_{V(M)}$. We claim that $\phi$ is an
isomorphism from $K_1$ to $K_2$. Given an $l$-ary relation $R$ in $L$ and $\bar{u} \in V(K_1)^l$, we have
to check that $R^{K_1}\bar{u} = R^{K_2}\phi\bar{u}$. The case that $\bar{u} \in V(M)^l$ is trivial. Suppose that
$\{u_1, \ldots, u_l\} \setminus V(M) = \{w_1, \ldots, w_p\}$ is nonempty. Note that $\{\phi(u_1), \ldots, \phi(u_l)\} \setminus
V(M) = \{\phi(w_1), \ldots, \phi(w_p)\}$ and $\{u_1, \ldots, u_l\} \cap V(M) = \{\phi(u_1), \ldots, \phi(u_l)\} \cap V(M)$.
Let $v_1, \ldots, v_p$ be pairwise distinct elements in $C$ that do not occur in $\bar{u}$ and hence
in $\phi\bar{u}$. Define $\psi_1$ by $\psi_1(v_i) = w_i$ for $i \le p$ and $\psi_1(x) = x$ for all other $x$ in $V(M)$.
Define $\psi_2$ similarly with the difference that $\psi_2(v_i) = \phi(w_i)$ for $i \le p$. Obviously,
$\psi_2^{-1}\phi\bar{u} = \psi_1^{-1}\bar{u}$. By B2, $\psi_1$ and $\psi_2$ are partial isomorphisms from $M$ to $K_1$ and $K_2$
respectively. Therefore

$$R^{K_1}\bar{u} = R^M\psi_1^{-1}\bar{u} = R^M\psi_2^{-1}\phi\bar{u} = R^{K_2}\phi\bar{u}.$$

The proof is complete. ■
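The existence part of the proof is constructive, and Condition C4 can be turned into a brute-force procedure for small structures. The sketch below is an illustration only: the encoding (a dictionary from relation names to sets of tuples, plus a dictionary of arities), the function name `clone`, and the labels `("new", i)` for the added elements are all assumptions made here. It assumes that the supplied class has at least as many elements as the maximum arity.

```python
from itertools import product


def clone(universe, relations, arity, v_class, t):
    """Return (universe_K, relations_K) with t new elements cloned from v_class.

    v_class is the class of pairwise interchangeable elements containing v;
    it is assumed to contain at least max(arity.values()) elements.
    """
    new = [("new", i) for i in range(t)]          # hypothetical labels
    universe_k = list(universe) + new
    relations_k = {}
    for name, tuples in relations.items():
        result = set(tuples)                      # tuples over V(M) are kept
        for u in product(universe_k, repeat=arity[name]):
            fresh = [w for w in dict.fromkeys(u) if w in new]
            if not fresh:
                continue
            # Any distinct substitutes outside u give the same truth value,
            # since elements of v_class are pairwise interchangeable in M.
            spare = [x for x in v_class if x not in u]
            pi = dict(zip(fresh, spare))
            if tuple(pi.get(x, x) for x in u) in tuples:
                result.add(u)
        relations_k[name] = result
    return universe_k, relations_k
```

As a sanity check, cloning one vertex of the complete graph on three vertices (edges stored in both directions, no loops) yields the complete graph on four vertices, and cloning a leaf of a star twice yields a star with two more leaves.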
Using Definition B, the following lemma is a direct consequence of Lemma 4.5.

Lemma 4.9 Let $L$ be a vocabulary with maximum relation arity $k$. Let $M$ and $M'$
be non-isomorphic $L$-structures of orders $n$ and $n'$ respectively and $n \le n'$. Then
the bound

$$D_1(M, M') < \left(1 - \frac{1}{2k}\right) n + k^2 - k + 2$$

may be false only if $M' = M^* \oplus (n' - n)v$ for some structure $M^*$ isomorphic with
$M$ and $v \in V(M^*)$.

4.3.4 An upper bound for D (M) 

The following result was obtained in [14] for graphs with the proof easily adaptable 
for any structures (see Lemma 4.2 and Remark 4.9 in [14]). 

Lemma 4.10 ([14]) Let $M$ be a structure of order $n$ with maximum relation arity
$k$, $v$ be an element of $M$ with $|[v]_M| = s \ge k$, and $M' = M \oplus tv$ with $t \ge 1$. Then

$$s + 1 \le D(M, M') \le D_1(M, M') \le s + k - 1 + \frac{n+1}{s+1}.$$

Putting Lemmas 4.9 and 4.10 together, we immediately obtain an upper bound
for $D(M)$. Recall that $\sigma(M) = \max_{v \in V(M)} |[v]_M|$.

Theorem 4.11 For a structure $M$ of order $n$ with maximum relation arity $k$, we
have

$$D_1(M) < \max\left\{ \left(1 - \frac{1}{2k}\right) n + k^2 - k + 2,\ \sigma(M) + k \right\}.$$
Proof. Given $M$, let us summarize the upper bounds we have for $D_1(M, M')$ for various
$M'$ non-isomorphic with $M$. Denote

$$u_{k,n} = \left(1 - \frac{1}{2k}\right) n + k^2 - k + 2 \quad \text{and} \quad f(s) = s + k - 1 + \frac{n+1}{s+1}.$$

If $M' = M^* \oplus tv$ for $M^*$ an isomorphic copy of $M$, then

$$D_1(M, M') \le \max_{1 \le s \le \sigma(M)} f(s) \tag{32}$$

by Lemma 4.10. Similarly, if $M = M^* \oplus tv$ for $M^*$ an isomorphic copy of $M'$, then

$$D_1(M, M') \le \max_{1 \le s \le \sigma(M')} f(s),$$

which is within the bound (32) because in this case $\sigma(M') \le \sigma(M)$. For all other
$M'$ we have

$$D_1(M, M') < u_{k,n}$$

by Lemma 4.9.
Notice now that

$$\max_{1 \le s \le \sigma(M)} f(s) \le \max\{f(1), f(\sigma(M))\}.$$



Furthermore, 



f(a(M))<( m if ^W^^- 1 )/ 2 . 



and $f(1) < u_{k,n}$. Summing up, we conclude that

$$\max_{M'} D_1(M, M') < \max\{u_{k,n},\ \sigma(M) + k\}.$$

By Lemma 2.2, the proof is complete. ■

Note that, given $M$, the number $\sigma(M)$ is efficiently computable in the sense that
computing $\sigma(M)$ reduces to verifying whether a transposition is an automorphism of
the structure. Thus, Theorem 4.11 provides an efficiently computable non-trivial
upper bound for $D_1(M)$, whereas it seems plausible that the exact value of $D(M)$
is incomputable.

We can also restate the obtained bounds as a dichotomy result telling us that
either we have the bound $D_1(M, M') < (1 - \frac{1}{2k})n + k^2 - k + 2$ or else $M$ has a
simple, easily recognizable property and, moreover, for all such exceptional $M$ we
are able to easily compute $D(M)$ within an additive constant. Results of this sort
are obtained in [14] for structures with maximum relation arity 2 and $k$-uniform
hypergraphs.



Theorem 4.12 Let $M$ be a structure of order $n$ with maximum relation arity $k$. If

$$\sigma(M) \le \left(1 - \frac{1}{2k}\right) n + (k-1)^2 + 1, \tag{33}$$

we have

$$D_1(M) < \left(1 - \frac{1}{2k}\right) n + k^2 - k + 2. \tag{34}$$

Otherwise we have

$$\sigma(M) + 1 \le D(M) \le D_1(M) \le \sigma(M) + k. \tag{35}$$

Proof. If the condition (33) is met, the bound (34) follows directly from Theorem
4.11. If (33) does not hold, the upper bound in (35) again follows from Theorem 4.11.
The lower bound in (35) follows from Lemma 4.10 as $D(M) \ge D(M, M \oplus 1v) \ge
\sigma(M) + 1$, where $v \in V(M)$ is such that $|[v]_M| = \sigma(M)$ and hence $|[v]_M| \ge k$. ■



5 Identifying finite structures by Bernays-Schönfinkel formulas

Theorem 5.1 Let $L$ be a vocabulary with maximum relation arity $k$. If $M$ is an
$L$-structure of order $n$, then

$$BS(M) < \left(1 - \frac{1}{2k^2 + 2}\right) n + k. \tag{36}$$

If $k = 1$, a stronger bound $BS(M) \le n/2 + 1$ holds true.

The case of $k = 1$ is easy and included for the sake of completeness. The upper
bound of $n/2 + 1$ matches, up to an additive constant of 1, a simple lower bound
of $n/2$ attainable by structures with a single unary relation. The proof of Theorem
5.1 for the case that $k \ge 2$ takes the rest of this section.

5.1 Notation 

In addition to the notation introduced in Section 2.1, we will denote $[k] = \{1, 2, \ldots, k\}$.
If $\bar{z} = (z_1, \ldots, z_l)$ and $\tau$ is a map from $[k]$ to $[l]$, then $\bar{z}^\tau = (z_{\tau(1)}, \ldots, z_{\tau(k)})$.

Recall that, given a partial isomorphism $\phi : X \to X'$ from an $L$-structure $M$ to
another $L$-structure $M'$, we have defined a relation $\equiv_\phi$ between elements in $\overline{X}$ and
elements in $\overline{X'}$ (see Definition 3.6). Definition 3.8 extends this relation over classes
in $\mathcal{C}(X)$ and $\mathcal{C}(X')$. We will need yet another extension of $\equiv_\phi$ over subsets of $\overline{X}$ and
$\overline{X'}$. Let $U \subseteq \overline{X}$ and $U' \subseteq \overline{X'}$. We will write $U \equiv_\phi U'$ if $\phi$ extends to an isomorphism
from $M[X \cup U]$ to $M'[X' \cup U']$.

We define $BS_q(M)$ similarly to $BS(M)$ with the only additional requirement that
an identifying Bernays-Schönfinkel formula has at most $q$ universal quantifiers. It is
clear that $BS(M) \le BS_{q+1}(M) \le BS_q(M)$.
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5.2 A couple of useful formulas 



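If $\bar x = (x_1, \ldots, x_l)$ is a sequence of variables, let

$$\mathrm{Dist}(\bar x) = \bigwedge_{1 \le i < j \le l} x_i \ne x_j.$$

Let $M$ be a finite structure over vocabulary $L$ and $\bar a$ be a sequence of $l$ pairwise distinct elements of $V(M)$. Then it is easy to construct a first order formula $\mathrm{Iso}_{M,\bar a}(x_1, \ldots, x_l)$ such that, for every $L$-structure $M'$ and $\bar a' \in V(M')^l$, $M', \bar a' \models \mathrm{Iso}_{M,\bar a}(\bar x)$ iff the component-wise correspondence between $\bar a$ and $\bar a'$ is a partial isomorphism between $M$ and $M'$. Specifically, assume that $L = (R_1, \ldots, R_m)$, where $R_i$ has arity $k_i$. Then $\mathrm{Iso}_{M,\bar a}$ can be taken to be a conjunction of atomic and negated atomic formulas in $x_1, \ldots, x_l$ that records, for every $i \le m$ and every map $\tau : [k_i] \to [l]$, whether $R_i \bar a^\tau$ holds in $M$.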
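One formula with the stated property, written with the notation $\bar z^\tau$ of Section 5.1, is sketched below. The particular arrangement of the conjuncts is an assumption of this sketch; any logically equivalent quantifier-free formula serves equally well.

```latex
\[
\mathrm{Iso}_{M,\bar a}(\bar x) \;=\; \mathrm{Dist}(\bar x)\;\wedge\;
\bigwedge_{i=1}^{m}
\Biggl(\;
\bigwedge_{\substack{\tau:[k_i]\to[l]\\ M \models R_i\,\bar a^{\tau}}} R_i\,\bar x^{\tau}
\;\wedge\;
\bigwedge_{\substack{\tau:[k_i]\to[l]\\ M \not\models R_i\,\bar a^{\tau}}} \neg R_i\,\bar x^{\tau}
\Biggr)
\]
```

The conjunct $\mathrm{Dist}(\bar x)$ forces the correspondence to be injective, and the remaining conjuncts force it to preserve each relation $R_i$, together with its negation, on all tuples drawn from $\bar a$.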



5.3 The first way of identification 

In this section we will exploit the relation ~ on $V(M)$ defined in Section 3.1 and the invariant $\sigma(M)$ introduced in Definition 4.2.

Proposition 5.2 Let $L$ be a vocabulary with maximum relation arity $k$. For every $L$-structure $M$ of order $n$, we have

$$\mathrm{BS}_k(M) \le n + k - \sigma(M).$$

Proof. Suppose that $\sigma(M) = k + d$ with $d \ge 1$ (if $\sigma(M) \le k$, the proposition is trivial). Let $A$ be a ~-equivalence class of elements of $V(M)$ such that $|A| = \sigma(M)$. Denote $B = \bar A$ and fix orderings $A = \{a_1, \ldots, a_{k+d}\}$ and $B = \{b_1, \ldots, b_{n-k-d}\}$. Set $\bar a = (a_1, \ldots, a_k)$ and $\bar b = (b_1, \ldots, b_{n-k-d})$. We suggest the following formula $\Phi_M$ to identify $M$:

$$\Phi_M = \exists y_1 \ldots \exists y_{n-k-d} \forall x_1 \ldots \forall x_k \Psi_M(\bar y, \bar x),$$

where

$$\Psi_M(\bar y, \bar x) = \mathrm{Iso}_{M,\bar b}(\bar y) \wedge \bigl(\mathrm{Dist}(\bar y, \bar x) \to \mathrm{Iso}_{M,\bar b,\bar a}(\bar y, \bar x)\bigr).$$

This formula has $(n - k - d) + k = n + k - \sigma(M)$ quantifiers, of which $k$ are universal.

Claim A. $M' \models \Phi_M$ iff there is a partial isomorphism $\phi : B \to B'$ from $M$ to $M'$ such that every injective extension of $\phi$ over $B \cup \{a_1, \ldots, a_k\}$ is a partial isomorphism from $M$ to $M'$.

Claim B. $M \models \Phi_M$.

Proof of Claim. On account of Claim A, it suffices to show that the extension of $\mathrm{id}_B$ by $\phi(a_1) = a_{i_1}, \ldots, \phi(a_k) = a_{i_k}$, where $i_1, \ldots, i_k$ is an arbitrary sequence of pairwise distinct indices in $[k + d]$, is a partial automorphism of $M$. This follows from the fact that every permutation of $A$, in particular one taking each $a_j$ for $j \le k$ to $a_{i_j}$, is decomposed into a product of transpositions $(a_p\, a_q)$ with $1 \le p < q \le k + d$. (Recall that the latter are automorphisms of $M$.) □

Claim C. If an $L$-structure $M'$ has order $n$ and $M' \models \Phi_M$, then $M$ and $M'$ are isomorphic.

Proof of Claim. Let $\phi$ and $B'$ be as in Claim A. Fix an ordering $a'_1, \ldots, a'_{k+d}$ of the set $A' = V(M') \setminus B'$. According to Claim A, for every sequence $i_1, \ldots, i_k$ of pairwise distinct indices in $[k + d]$, the extension of $\phi$ by $\phi(a_j) = a'_{i_j}$ for $j \le k$ is a partial isomorphism from $M$ to $M'$. From the proof of Claim B we know an analog of this fact for $M$ itself: for every sequence $i_1, \ldots, i_k$ of pairwise distinct indices in $[k + d]$, the extension $\psi$ of $\mathrm{id}_B$ by $\psi(a_j) = a_{i_j}$ for $j \le k$ is a partial automorphism of $M$. It follows, in particular, that for every sequence $1 \le i_1 < \ldots < i_k \le k + d$, the extension of $\phi$ by $\phi(a_{i_j}) = a'_{i_j}$ for $j \le k$ is a partial isomorphism from $M$ to $M'$. Extend $\phi$ over the whole $V(M)$ by $\phi(a_i) = a'_i$ for all $i \le k + d$. We conclude that the restriction of $\phi$ on every $k$-element subset of $V(M)$ is a partial isomorphism from $M$ to $M'$. Since every relation of $M$ has arity at most $k$, $\phi$ is an isomorphism from $M$ to $M'$. □ ■

5.4 The second way of identification 

Definition 5.3 A set $B \subseteq V(M)$ is called a base of a structure $M$ if the relations $\equiv_B$ and ~ coincide on $\bar B$. The fineness of a base $B$ is defined by $f(B) = \max\{|C| : C \in \mathcal{C}(B)\}$. Furthermore, let $\rho(B) = |B| + \max\{f(B) + 1, k\}$. We define $\rho(M)$ to be the minimum $\rho(B)$ over all bases $B$ of $M$.

Proposition 5.4 $\mathrm{BS}(M) \le \rho(M)$.

Proof. Given a base $B$ of $M$, we construct a Bernays-Schönfinkel formula $\Phi_M$ with $\rho(B)$ quantifiers that identifies $M$. Let $p = |B|$ and $q = \max\{f(B) + 1, k\}$. Assume that $p + q < n$, for otherwise we are done. Denote $A = \bar B$ and fix orderings $A = \{a_1, \ldots, a_{n-p}\}$ and $B = \{b_1, \ldots, b_p\}$. We set

$$\Phi_M = \exists y_1 \ldots \exists y_p \forall x_1 \ldots \forall x_q \Psi_M(\bar y, \bar x),$$

where

$$\Psi_M(\bar y, \bar x) = \mathrm{Iso}_{M,\bar b}(\bar y) \wedge \Bigl(\mathrm{Dist}(\bar y, \bar x) \to \bigvee_{\substack{\tau : [q] \to [n-p] \\ \tau \text{ is injective}}} \mathrm{Iso}_{M,\bar b,\bar a^\tau}(\bar y, \bar x)\Bigr).$$

Claim A. Let $M'$ be another $L$-structure, $\bar b' = (b'_1, \ldots, b'_p)$ be a sequence of elements of $V(M')$, and $A' = V(M') \setminus \{b'_1, \ldots, b'_p\}$. Then $M', \bar b' \models \forall x_1 \ldots \forall x_q \Psi_M(\bar y, \bar x)$ holds iff

• the component-wise correspondence between $\bar b$ and $\bar b'$ is a partial isomorphism $\phi$ from $M$ to $M'$ and

• for every $U' \subseteq A'$ with at most $q$ elements there is a $U \subseteq A$ such that $U \equiv_\phi U'$.

The proof is fairly obvious. The claim immediately implies that $M \models \Phi_M$.

Claim B. If $M' \models \Phi_M$ and $M'$ has order $n$, then $M$ and $M'$ are isomorphic.

Proof of Claim. Let $\bar b' = (b'_1, \ldots, b'_p)$ be such that

$$M', \bar b' \models \forall x_1 \ldots \forall x_q \Psi_M(\bar y, \bar x).$$

Set $B' = \{b'_1, \ldots, b'_p\}$. By the definition of $\Psi_M$, there is a partial isomorphism $\phi : B \to B'$ from $M$ to $M'$. By Claim A, for every $a' \in A'$ there is $a \in A$ such that $a \equiv_\phi a'$. Hence for every $C' \in \mathcal{C}(B')$ there is $C \in \mathcal{C}(B)$ such that $C \equiv_\phi C'$. Moreover, for every $C' \in \mathcal{C}(B')$ and the respective $C \in \mathcal{C}(B)$ it holds $|C| \ge |C'|$ (if $|C'| > |C|$, then for any $(|C| + 1)$-element set $U' \subseteq C'$ the second condition in Claim A fails). Since $|A| = |A'|$ or, in other terms, $\sum_{C \in \mathcal{C}(B)} |C| = \sum_{C' \in \mathcal{C}(B')} |C'|$, the equality $|C| = |C'|$ actually holds for every $C'$. Thus, we have a one-to-one correspondence between $\mathcal{C}(B)$ and $\mathcal{C}(B')$ such that, if $C \in \mathcal{C}(B)$ and $C' \in \mathcal{C}(B')$ correspond to one another, then $C \equiv_\phi C'$ and $|C| = |C'|$.

We are now prepared to exhibit an isomorphism from $M'$ to $M$. Fix an arbitrary extension $\psi$ of $\phi^{-1}$ to a one-to-one map from $V(M')$ to $V(M)$ taking each $C'$ to the respective $C$. We will show that $\psi$ is an isomorphism. Let $R'$ be an $l$-ary relation of $M'$ and $R$ be the respective relation of $M$. Given an arbitrary $l$-tuple $\bar u' \in V(M')^l$, we have to prove that

$$R\psi\bar u' = R'\bar u'. \tag{37}$$

Denote $U' = \{u'_1, \ldots, u'_l\}$. Let $\psi_{U'}$ be the extension of $\phi^{-1}$ to a partial isomorphism from $M'$ to $M$ with $U' \subseteq \mathrm{dom}\,\psi_{U'}$ whose existence is guaranteed by Claim A. We have

$$R\psi_{U'}\bar u' = R'\bar u'.$$

To prove (37), it suffices to prove that

$$R\psi_{U'}\bar u' = R\psi\bar u'. \tag{38}$$

We proceed similarly to the proof of Claim E in Section 4.1. By Item 3 of Lemma 3.7, the partial map $\psi_{U'}$ takes an element in a class $C'$ to an element in the respective class $C$. Suppose that $\psi_{U'}$ is extended over the whole $V(M')$ with the latter condition obeyed. Since both $\psi_{U'}$ and $\psi$ extend $\phi^{-1}$, the product $\psi_{U'}\psi^{-1}$ moves only elements in $A$. Since both $\psi$ and $\psi_{U'}$ take an element in a class $C'$ to an element in the respective class $C$, the map $\psi_{U'}\psi^{-1}$ preserves the partition $\mathcal{C}(B)$ of $A$. It follows that $\psi_{U'}\psi^{-1}$ is decomposed into the product of permutations $\pi_C$ over $C \in \mathcal{C}(B)$, where each $\pi_C$ acts on the respective $C$. Since every $\pi_C$ is decomposable into a product of transpositions, we have $\psi_{U'}\psi^{-1} = \tau_1\tau_2 \cdots \tau_t$ with $\tau_i$ being a transposition of two elements both in some $C$. It is easy to see that $\psi_{U'}\bar u' = (\ldots((\psi\bar u')^{\tau_1})\ldots)^{\tau_t}$. By Lemma 3.12, each application of $\tau_i$ does not change the initial value of $R\psi\bar u'$. Therewith (38) is proved. □ ■

Remark 5.5 One can show that $\rho(M)$ provides us with an upper bound not only for $\mathrm{BS}(M)$ but also for $D_1(M)$.

5.5 The third way of identification

The third way of identification that we suggest here is actually not new: it is a special case of Proposition 5.4 from the preceding section.

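Definition 5.6 If $M$ is a finite structure, let

$$\delta(M) = \max\{\, |A| : A \subseteq V(M) \text{ such that } a_1 \not\equiv_{\bar A} a_2 \text{ for every two distinct } a_1, a_2 \in A \,\}.$$

It is not hard to see that, in other terms, $\delta(M) = \max_{X \subseteq V(M)} |\mathcal{C}(X)|$.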

Proposition 5.7 Let $L$ be a vocabulary with maximum relation arity $k \ge 2$. For every $L$-structure $M$ of order $n$, we have

$$\mathrm{BS}_k(M) \le n + k - \delta(M).$$

Proof. As easily seen, if $A \subseteq V(M)$ is such that $a_1 \not\equiv_{\bar A} a_2$ for every two distinct $a_1, a_2 \in A$, then $\bar A = V(M) \setminus A$ is a base of $M$ with fineness $f(\bar A) = 1$. Since $k \ge 2$, we have $\max\{f(\bar A) + 1, k\} = k$ and therefore $\rho(M) \le n + k - \delta(M)$. Thus, the proposition directly follows from Proposition 5.4. We only have to note that the identifying formula constructed in the proof of Proposition 5.4 has $\max\{f(\bar A) + 1, k\} = k$ universal quantifiers. ■
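For graphs ($k = 2$) the two invariants behind Propositions 5.2 and 5.7 can be computed by exhaustive search. The following sketch is an illustration only: the dictionary-of-neighbour-sets representation and the names `twins`, `sigma`, `delta`, and `upper_bound` are assumptions of this example, and $\equiv_X$ is taken to mean "same neighbourhood in $X$", which is how the relation specializes to loopless graphs.

```python
from itertools import combinations

def twins(adj, u, v):
    """True iff u ~ v, i.e. no third vertex is adjacent to exactly one of u, v."""
    return all((w in adj[u]) == (w in adj[v]) for w in adj if w != u and w != v)

def sigma(adj):
    """Size of a largest ~-equivalence class."""
    return max(sum(1 for v in adj if v == u or twins(adj, u, v)) for u in adj)

def delta(adj):
    """Maximum over X of the number of distinct neighbourhoods-in-X
    realized by vertices outside X (the number of classes in C(X))."""
    vertices = list(adj)
    best = 0
    for r in range(len(vertices) + 1):
        for X in combinations(vertices, r):
            xs = set(X)
            classes = {frozenset(adj[v] & xs) for v in vertices if v not in xs}
            best = max(best, len(classes))
    return best

def upper_bound(adj):
    """n + 2 - max{delta, sigma}: the better of the two bounds for k = 2."""
    return len(adj) + 2 - max(delta(adj), sigma(adj))

# Path 1-2-3 together with two isolated vertices 4 and 5.
H = {1: {2}, 2: {1, 3}, 3: {2}, 4: set(), 5: set()}
print(sigma(H), delta(H), upper_bound(H))  # 2 2 5
```

The search over all subsets $X$ is exponential in the order of the graph, so the sketch is only meant for very small examples.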



5.6 Putting it together 

We now complete the proof of Theorem 5.1. Assume that $k \ge 2$. We will employ all three possibilities of identifying $M$ given by Propositions 5.7, 5.2, and 5.4. For the last possibility we will use the set $X_{k+1}$ defined by Definition 3.11, which is a base of $M$ according to Lemma 3.12.

By the bound (10) of Lemma 3.13 and the fact that $|\mathcal{C}(X)| \le \delta(M)$ for every $X \subseteq V(M)$, we have

$$|X_{k+1}| = n - |Z| \le 2k^2\delta(M) - (k - 1). \tag{39}$$

We now consider two cases.

Case 1: $Z = \emptyset$. By (39) we have $\delta(M) \ge \frac{n + k - 1}{2k^2}$. By Proposition 5.7, this implies that

$$\mathrm{BS}(M) < \Bigl(1 - \frac{1}{2k^2}\Bigr) n + k.$$

Case 2: $Z \ne \emptyset$. In this case for the fineness of the base $X_{k+1}$ we have $f(X_{k+1}) \ge k + 2$. Using (39), we obtain

$$\rho(X_{k+1}) \le 2k^2\delta(M) - (k - 1) + \max\{|C| : C \in \mathcal{C}(X_{k+1})\} + 1 \le 2k^2\delta(M) + \sigma(M) + 2 - k.$$

Let $\lambda(M) = \max\{\delta(M), \sigma(M)\}$. By Propositions 5.7, 5.2, and 5.4, we have

$$\begin{aligned}
\mathrm{BS}(M) &\le \min\{\, n + k - \delta(M),\; n + k - \sigma(M),\; 2k^2\delta(M) + \sigma(M) + 2 - k \,\} \\
&\le \min\{\, n + k - \lambda(M),\; (2k^2 + 1)\lambda(M) + 2 - k \,\} \\
&\le \max_{1 \le \lambda \le n} \min\{\, n + k - \lambda,\; (2k^2 + 1)\lambda + 2 - k \,\} \\
&\le \Bigl(1 - \frac{1}{2k^2+2}\Bigr) n + k - \frac{k-1}{k^2+1} < \Bigl(1 - \frac{1}{2k^2+2}\Bigr) n + k.
\end{aligned}$$

Since the latter bound holds in both cases, the proof of Theorem 5.1 for $k \ge 2$ is complete.

In the case of $k = 1$ we use Propositions 5.2 and 5.4. We use the fact that, for a structure $M$ with all relations unary, the empty set is a base and $\rho(\emptyset) = \sigma(M) + 1$. We therefore have $\mathrm{BS}(M) \le \min\{n + 1 - \sigma(M), \sigma(M) + 1\} \le n/2 + 1$.
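The two optimization steps above can be checked numerically. The sketch below compares the max-min over integer values of $\lambda$ with the value obtained by balancing the two terms, $n + k - \lambda = (2k^2 + 1)\lambda + 2 - k$, that is, at $\lambda = (n + 2k - 2)/(2k^2 + 2)$. The function names are assumptions of this illustration, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def case2_bound(n, k):
    """max over integer lam in [1, n] of min{n + k - lam, (2k^2 + 1) lam + 2 - k}."""
    return max(min(n + k - lam, (2 * k * k + 1) * lam + 2 - k)
               for lam in range(1, n + 1))

def balanced_value(n, k):
    """(1 - 1/(2k^2+2)) n + k - (k-1)/(k^2+1): the value at the balancing point."""
    return (1 - Fraction(1, 2 * k * k + 2)) * n + k - Fraction(k - 1, k * k + 1)

def theorem_bound(n, k):
    """(1 - 1/(2k^2+2)) n + k: the right-hand side of (36)."""
    return (1 - Fraction(1, 2 * k * k + 2)) * n + k

def unary_bound(n):
    """max over s in [1, n] of min{n + 1 - s, s + 1} (the case k = 1)."""
    return max(min(n + 1 - s, s + 1) for s in range(1, n + 1))

# For k = 2 the theorem bound is 0.9 n + 2.
print(case2_bound(100, 2), float(balanced_value(100, 2)), float(theorem_bound(100, 2)))
print(unary_bound(100), 100 / 2 + 1)
```

Since one term decreases and the other increases in $\lambda$, the minimum of the two never exceeds their common value at the balancing point, which is why the integer maximum stays below `balanced_value`.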



6 Identifying finite structures by Bernays-Schönfinkel formulas with bounded number of universal quantifiers

Recall that $\mathrm{BS}_q(M)$ denotes the minimum total number of quantifiers in a Bernays-Schönfinkel formula identifying $M$ with at most $q$ universal quantifiers. We now address the asymptotics of the maximum value of $\mathrm{BS}_q(M)$ over structures of order $n$ under the condition that $q$ is bounded by a constant. We first observe that fewer than $k$ universal quantifiers are rather useless for identification of a structure with maximum relation arity $k$.

Proposition 6.1 If $M$ is a structure of order $n$ with maximum relation arity $k$ and $n \ge k$, then $\mathrm{BS}_{k-1}(M) = n$.

Proof. We have to show that no formula $\Phi = \exists y_1 \ldots \exists y_p \forall x_1 \ldots \forall x_q \Psi(\bar y, \bar x)$ with $\Psi$ quantifier-free, $q \le k - 1$, and $p + q \le n - 1$ can identify $M$. Suppose that

$$M, \bar b, \bar a \models \Psi(\bar y, \bar x) \tag{40}$$

for some $\bar b \in V(M)^p$ and all $\bar a \in V(M)^q$. Let $A = V(M) \setminus \{b_1, \ldots, b_p\}$. Since $q + 1 \le k$, $q + 1 \le n - p \le |A|$, and $n \ge k$, there is a $k$-element $U \subseteq V(M)$ such that $|U \cap A| \ge q + 1$. Let $\bar u = (u_1, \ldots, u_k)$ be an arbitrary ordering of $U$. Let $R$ be a $k$-ary relation of $M$. Define a relation $R'$ so that $R'\bar u \ne R\bar u$ and $R'$ coincides with $R$ elsewhere. Let $M'$ be the modification of $M$ with $R'$ instead of $R$. Clearly, $M'$ and $M$ are non-isomorphic. It is easy to see that $M', \bar b, \bar a \models \Psi(\bar y, \bar x)$ for the same $\bar b$ as in (40) and all $\bar a \in V(M')^q$. Therefore $M' \models \Phi$ and $\Phi$ fails to identify $M$. ■

If at least $k$ universal quantifiers are available, some saving on the number of quantifiers is possible: it turns out that $\mathrm{BS}_k(M) < n - \sqrt{n} + k^2 + k$ and this bound cannot be improved much if we keep the number of universal quantifiers constant.






Theorem 6.2 Let $\mathrm{BS}_q(n, k)$ denote the maximum $\mathrm{BS}_q(M)$ over structures $M$ of order $n$ and maximum relation arity $k$. Then

$$\mathrm{BS}_k(n, k) < n - \sqrt{n} + k^2 + k.$$

On the other hand, if $n$ is a square, then

$$\mathrm{BS}_q(n, k) \ge n - (q - 1)\sqrt{n} + q$$

for every $q \ge 2$ and $k \ge 2$.

The upper bound of Theorem 6.2 is provable by the techniques from Section 5. Let $M$ be a structure of order $n$ with maximum relation arity $k$. By Propositions 5.7 and 5.2,

$$\mathrm{BS}_k(M) \le n + k - \max\{\delta(M), \sigma(M)\}.$$

It remains to prove the following bound.

Lemma 6.3 $\max\{\delta(M), \sigma(M)\} > \sqrt{n} - k^2$.

Proof. By the bound (10) of Lemma 3.13,

$$n + k - 1 \le 2k \sum_{i=1}^{k-1} |\mathcal{C}(X_i)| + (k + 1)|\mathcal{C}_{k+1}(X_k)| + (k - 1)|\mathcal{C}(X_k)| + |Z|.$$

We bound each term $|\mathcal{C}(X)|$ from above by $\delta(M)$. Furthermore, we bound $|Z|$ from above by the number of $\equiv_{X_{k+1}}$-equivalence classes inside $Z$ multiplied by the maximum number of elements in such a class. By Lemma 3.12 it follows that $|Z| \le \delta(M)\sigma(M)$. We therefore conclude that

$$n + k - 1 \le \delta(M)\bigl(2k^2 + \sigma(M)\bigr).$$

This implies

$$\max\{\delta(M), \sigma(M)\} \ge \max\Bigl\{\frac{n + k - 1}{2k^2 + \sigma(M)},\; \sigma(M)\Bigr\} > \sqrt{n} - k^2,$$

as required. ■
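The final inequality of the proof can be verified by balancing the two terms under the maximum. In the sketch below, $s$ denotes the positive solution of $s(2k^2 + s) = n + k - 1$ and $\sigma$ stands for $\sigma(M)$; the symbol $s$ is introduced only for this derivation.

```latex
\begin{align*}
s(2k^2+s) = n+k-1 \quad&\Longrightarrow\quad s = -k^2 + \sqrt{k^4 + n + k - 1},\\
\sigma \ge s \quad&\Longrightarrow\quad
  \max\Bigl\{\tfrac{n+k-1}{2k^2+\sigma},\,\sigma\Bigr\} \ge \sigma \ge s,\\
\sigma < s \quad&\Longrightarrow\quad
  \max\Bigl\{\tfrac{n+k-1}{2k^2+\sigma},\,\sigma\Bigr\}
  \ge \tfrac{n+k-1}{2k^2+\sigma} > \tfrac{n+k-1}{2k^2+s} = s,\\
s = \sqrt{k^4+n+k-1}-k^2 \;&>\; \sqrt{n}-k^2
  \qquad\text{since } k^4+k-1>0 .
\end{align*}
```

In either case the maximum is at least $s$, and $s$ exceeds $\sqrt{n} - k^2$.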



Remark 6.4 The bound of Lemma 6.3 is essentially optimal because, for any graph $G$ of order $m^2$ whose vertex set is partitioned into $m$ ~-equivalence classes of $m$ elements each, it holds $\sigma(G) = m$ and $\delta(G) \le m$. Such $G$ can be constructed from any graph $H$ of order $m$ whose automorphism group contains no transposition by replacing each vertex $v \in V(H)$ with $m$ pairwise (non-)adjacent vertices ~-related to $v$ in $H$.
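The construction of the remark can be tried out on a small instance. The sketch below takes $H$ to be the path on four vertices, a choice made for this illustration because no two of its vertices can be interchanged by a transposition, and replaces each vertex by $m = 4$ pairwise non-adjacent copies. The bitmask representation and the names `blow_up`, `sigma`, and `delta` are assumptions of the example; $\equiv_X$ is again read as "same neighbourhood in $X$".

```python
def blow_up(h_order, h_edges, m):
    """Neighbourhood bitmasks of the graph obtained from H by replacing each
    vertex with m pairwise non-adjacent copies; copy i of v has index v*m + i."""
    nbr = [0] * (h_order * m)
    for v, w in h_edges:
        for i in range(m):
            for j in range(m):
                nbr[v * m + i] |= 1 << (w * m + j)
                nbr[w * m + j] |= 1 << (v * m + i)
    return nbr

def sigma(nbr):
    """Size of a largest ~-class: u ~ v iff no third vertex separates them."""
    n = len(nbr)
    best = 0
    for u in range(n):
        size = 0
        for v in range(n):
            mask = ~((1 << u) | (1 << v))  # ignore u and v themselves
            if (nbr[u] & mask) == (nbr[v] & mask):
                size += 1
        best = max(best, size)
    return best

def delta(nbr):
    """Maximum over vertex sets X (as bitmasks) of the number of distinct
    neighbourhoods-in-X among the vertices outside X."""
    n = len(nbr)
    best = 0
    for X in range(1 << n):
        classes = {nbr[v] & X for v in range(n) if not (X >> v) & 1}
        best = max(best, len(classes))
    return best

# H = path 0-1-2-3; G has 16 vertices in 4 classes of 4 twins each.
G = blow_up(4, [(0, 1), (1, 2), (2, 3)], 4)
print(len(G), sigma(G), delta(G))  # 16 4 4
```

Here $\max\{\delta(G), \sigma(G)\} = 4 = \sqrt{16}$, in line with the remark that the bound of Lemma 6.3 cannot be improved by much.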

We now prove the lower bound of Theorem 6.2. It suffices to do it for graphs. The example of $G$ with large $\mathrm{BS}_q(G)$ will be the same as in Remark 6.4. This example can be lifted to a higher arity $k$ by adding $k - 2$ dummy coordinates to the adjacency relation with no effect on its truth value.

Proposition 6.5 Let $G_m$ be a graph of order $m^2$ whose vertex set is partitioned into $m$ ~-equivalence classes of $m$ elements each. Let $q \ge 2$. Then $\mathrm{BS}_q(G_m) \ge m^2 - (q - 1)m + q$.

Proof. It is enough to show that, if $G_m$ is identified by a Bernays-Schönfinkel formula $\Phi$ with $q$ universal quantifiers, then $\Phi$ contains at least $m^2 - (q - 1)m$ existential quantifiers. If $q \ge m + 1$, this is trivial. Assume that $q \le m$.

Suppose on the contrary that $G_m$ is identified by a Bernays-Schönfinkel formula $\Phi = \exists y_1 \ldots \exists y_p \forall x_1 \ldots \forall x_q \Psi(\bar y, \bar x)$ with $p < m^2 - (q - 1)m$. Let $\bar b \in V(G_m)^p$ be such that $G_m, \bar b \models \forall x_1 \ldots \forall x_q \Psi(\bar y, \bar x)$. Equivalently,

$$G_m, \bar b, \bar a \models \Psi(\bar y, \bar x) \quad \text{for all } \bar a \in V(G_m)^q. \tag{41}$$

Let $A = V(G_m) \setminus \{b_1, \ldots, b_p\}$. We have $|A| \ge (q - 1)m + 1$. The condition imposed on $G_m$ implies that there are two ~-equivalence classes, $C_1$ and $C_2$, such that $|A \cap C_1| \ge q$ and $|A \cap C_2| \ge 1$. Let us modify $G_m$ by removing one vertex from $A \cap C_2$ and adding a new vertex $v'$ to $C_1$ so that $v' \sim v$ for all $v \in C_1$. The modified graph, $G'$, is clearly non-isomorphic to $G_m$. We show that, nevertheless, $G' \models \Phi$.

It suffices to show that $G', \bar b, \bar a' \models \Psi(\bar y, \bar x)$ for every $\bar a' \in V(G')^q$. In view of (41), we are done if for every $\bar a' \in V(G')^q$ we are able to find an $\bar a \in V(G_m)^q$ such that the component-wise correspondence between $\bar b, \bar a$ and $\bar b, \bar a'$ is a partial isomorphism between $G_m$ and $G'$. If $\bar a'$ does not contain any occurrence of $v'$, we obviously can take $\bar a = \bar a'$. If $\bar a'$ contains an occurrence of $v'$, let $v$ be a vertex in $A \cap C_1$ that does not occur in $\bar a'$ and let $\bar a$ be the result of substituting $v$ in place of $v'$ everywhere in $\bar a'$. It is not hard to see that the obtained $\bar a$ is as required. ■

7 The case of graphs 

For a binary structure $M$, Theorem 4.1 implies $I_1(M) < 0.75n + 4$ and Theorem 5.1 implies $\mathrm{BS}(M) < 0.9n + 2$. In the case of graphs, both these bounds can be improved. In [14] we obtain an almost optimal bound $I(G) \le (n + 3)/2$ (there are simple examples of graphs with $I(G) \ge (n + 1)/2$). Combining the approach from [14] and the techniques from Section 5, we are able to prove a better bound for $\mathrm{BS}(G)$ as well. We are also interested in the smallest $n$ starting from which every graph $G$ of order $n$ satisfies at least the bound $\mathrm{BS}(G) \le n - 1$, an improvement on the trivial bound of $n$.

Theorem 7.1 Let $G$ be a graph of order $n$.

1) We have $\mathrm{BS}(G) \le 3n/4 + 3/2$.

2) If $n \ge 5$, we have $\mathrm{BS}_2(G) \le n - 1$ with the only exception of the graph $H$ on 5 vertices with 2 adjacent edges for which, nevertheless, we have $\mathrm{BS}_3(H) \le 4$.

Proof. Given a graph $G$, let $X = E(\emptyset)$, where the transformation $E$ is introduced in Section 3.2. We state two properties of the set $X$ established in [14]:






Property 1. $|\mathcal{C}(X)| \ge |X| + 1$.

Property 2. Let $Y = Y(X)$ and $Z = V(G) \setminus (X \cup Y)$. Every class in $\mathcal{C}(X \cup Y)$ consists of pairwise ~-equivalent vertices.

Note that $|\mathcal{C}(X)| \le \delta(G)$. By Property 1 we conclude that

$$|X| + |Y| \le |X| + |\mathcal{C}(X)| \le 2|\mathcal{C}(X)| - 1 \le 2\delta(G) - 1. \tag{42}$$

Property 2 means that $X \cup Y$ is a base of $G$. We now consider two cases.

Case 1: $Z = \emptyset$. In this case $n = |X| + |Y|$. By (42), $\delta(G) \ge (n + 1)/2$. Using Proposition 5.7, we obtain $\mathrm{BS}_2(G) \le n/2 + 3/2$.

Case 2: $Z \ne \emptyset$. In this case $\rho(X \cup Y) \le |X| + |Y| + \sigma(G) + 1$. By (42) we have $\rho(X \cup Y) \le 2\delta(G) + \sigma(G)$. Denote $\lambda(G) = \max\{\delta(G), \sigma(G)\}$. By Propositions 5.7, 5.2, and 5.4, we have

$$\begin{aligned}
\mathrm{BS}(G) &\le \min\{\, n + 2 - \delta(G),\; n + 2 - \sigma(G),\; 2\delta(G) + \sigma(G) \,\} \\
&\le \min\{\, n + 2 - \lambda(G),\; 3\lambda(G) \,\} \\
&\le \max_{1 \le \lambda \le n} \min\{\, n + 2 - \lambda,\; 3\lambda \,\} = 3n/4 + 3/2.
\end{aligned}$$

Since this bound holds true in both cases, Item 1 of the theorem is proved.

To prove Item 2, we estimate $\max\{\delta(G), \sigma(G)\}$. Since $n = |X| + |Y| + |Z| \le 2\delta(G) - 1 + \delta(G)\sigma(G)$, we have

$$n + 1 \le \delta(G)\bigl(2 + \sigma(G)\bigr). \tag{43}$$

It follows that

$$\max\{\delta(G), \sigma(G)\} \ge \min_{1 \le c \le n} \max\Bigl\{\frac{n + 1}{2 + c},\; c\Bigr\} = \sqrt{n + 2} - 1$$

and hence

$$\max\{\delta(G), \sigma(G)\} \ge 3 \tag{44}$$

whenever $n > 7$.

Claim A. The bound (44) holds for all $G$ of order 6.

This claim is proved by a direct brute force analysis. In carrying it out, it suffices to consider graphs on 6 vertices with at most 7 edges. The reason is that $\delta(G) = \delta(\bar G)$ and $\sigma(G) = \sigma(\bar G)$, where $\bar G$ denotes the complement of a graph $G$, i.e., the graph with the same vertex set and exactly those edges absent in $G$.
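A brute force analysis of this kind can be sketched as follows. The code enumerates all labelled graphs of a given small order and collects those violating (44). It is an illustration under stated assumptions and not the procedure used for the claim: the bitmask representation and the names `all_graphs`, `sigma`, `delta`, and `exceptions` are introduced here, and $\equiv_X$ is read as "same neighbourhood in $X$".

```python
from itertools import combinations

def all_graphs(n):
    """Yield every labelled graph on vertices 0..n-1 as a list of
    neighbourhood bitmasks."""
    pairs = list(combinations(range(n), 2))
    for code in range(1 << len(pairs)):
        nbr = [0] * n
        for bit, (u, v) in enumerate(pairs):
            if (code >> bit) & 1:
                nbr[u] |= 1 << v
                nbr[v] |= 1 << u
        yield nbr

def sigma(nbr):
    """Size of a largest ~-class: u ~ v iff no third vertex separates them."""
    n = len(nbr)
    best = 0
    for u in range(n):
        size = 0
        for v in range(n):
            mask = ~((1 << u) | (1 << v))  # ignore u and v themselves
            if (nbr[u] & mask) == (nbr[v] & mask):
                size += 1
        best = max(best, size)
    return best

def delta(nbr):
    """Maximum over X of the number of distinct neighbourhoods-in-X
    among the vertices outside X."""
    n = len(nbr)
    best = 0
    for X in range(1 << n):
        classes = {nbr[v] & X for v in range(n) if not (X >> v) & 1}
        best = max(best, len(classes))
    return best

def exceptions(n):
    """Labelled graphs of order n with max{delta, sigma} < 3."""
    return [g for g in all_graphs(n) if max(delta(g), sigma(g)) < 3]

# Path 0-1-2 plus isolated vertices 3 and 4 violates (44) at order 5.
H = [0b00010, 0b00101, 0b00010, 0, 0]
print(H in exceptions(5))  # True
```

Calling `exceptions(6)` runs the same exhaustive search over all $2^{15}$ labelled graphs of order 6 and takes noticeably longer than the order-5 run. Because $\delta$ and $\sigma$ are invariant under complementation, the search could be restricted to graphs with at most 7 edges as described above; the sketch omits this optimization.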

Claim B. The bound (44) holds for all G of order 7. 

Proof of Claim. Given three vertices x, y, and z, we say that x separates y and z 
if x is adjacent to exactly one of y and z. Note that y ~ z iff no x separates these 
vertices. 

Let a graph $G$ have 7 vertices. If $\sigma(G) = 1$, then $\delta(G) \ge 3$ by (43). Our task is therefore to deduce $\delta(G) \ge 3$ from the assumption that $\sigma(G) = 2$.

Let $u$ and $v$ be ~-equivalent vertices of $G$. Suppose that they are adjacent. We do not lose generality because it suffices to prove the claim for one of $G$ or $\bar G$. Let us remove $u$, that is, consider the graph $G - u = G[V(G) \setminus \{u\}]$. As is easy to see, $\delta(G) \ge \delta(G - u)$. Thus, if $\delta(G - u) \ge 3$, we are done. Otherwise, by Claim A, we have $\sigma(G - u) \ge 3$.

Let $a$, $b$, and $c$ be pairwise ~-equivalent vertices in $G - u$. Assume for a while that $v \notin \{a, b, c\}$. Since $u \sim v$ in $G$, the vertex $u$ does not separate $a$, $b$, and $c$ because $v$ does not. Thus, these three vertices are pairwise ~-equivalent in $G$, contradicting our assumption that $\sigma(G) = 2$. We conclude that $v \in \{a, b, c\}$.

Without loss of generality, assume that $v = c$. Since $v$ cannot be ~-equivalent with $a$ or $b$ in $G$ and only $u$ can separate $v$ from $a$ and from $b$, the vertex $u$ must be non-adjacent to $a$ and to $b$. The same is true for the ~-equivalent vertex $v$. Thus, $\{u, a\}$, $\{u, b\}$, $\{v, a\}$, and $\{v, b\}$ all are non-edges. Since $\{a, v\}$ is a non-edge, $\{a, b\}$ is a non-edge too because $v = c \sim b$ in $G - u$. Note that $a \sim b$ in $G$ because this is so in $G - u$ and $u$ does not separate these two vertices.

Apply now the same trick with removal of $a$ instead of $u$. If we are not done, then $G - a$ has a ~-equivalence class $\{b, s, t\}$. Our argument is completely symmetric with the difference that $a$ and $b$ are now non-adjacent vertices. We hence should switch over all (non-)adjacencies and conclude that $\{a, s\}$, $\{a, t\}$, $\{b, s\}$, $\{b, t\}$, and $\{s, t\}$ all are edges of $G$. Furthermore, $s \sim t$ in $G$.

Notice that $s$, $t$, $u$, and $v$ are pairwise distinct. Indeed, since $u$ separates $b$ and $v$, we have $v \notin \{s, t\}$. Similarly, $u \notin \{s, t\}$.

We apply the same trick once again, now with removal of $s$. As above, unless we are done, $G - s$ has a ~-equivalence class $A$ containing $t$ and at least 2 more vertices. As above, the vertices $a$ and $b$ cannot belong to $A$. It follows that $A$ contains at least one of $u$ and $v$. Actually $A$ must contain both $u$ and $v$ because these vertices are ~-equivalent. As $A$ is not a ~-class in $G$, the vertex $s$ must separate $t$ from $u$ and from $v$. Therefore $\{s, u\}$ and $\{s, v\}$ are non-edges. As $t \sim s$, $\{t, u\}$ and $\{t, v\}$ are non-edges too. But now the triple $a, s, u$ shows that $\delta(G) \ge 3$. Indeed, $v$ separates $u$ from $a$ and from $s$ while $b$ separates $a$ and $s$. □

Thus, if $G$ has order at least 6, we have the bound (44) and the theorem follows from Propositions 5.7 and 5.2. For graphs of order 5 the estimate (44) holds with the only exception for the specified graph $H$. This graph is identified by the formula

$$\exists y_1 \forall x_1 \forall x_2 \forall x_3 \Bigl( \mathrm{Dist}(y_1, x_1, x_2, x_3) \to \neg E(x_1, x_2) \wedge \bigvee_{i=1}^{3} E(y_1, x_i) \wedge \bigvee_{i=1}^{3} \neg E(y_1, x_i) \Bigr),$$

where $E$ is the adjacency relation.

Remark 7.2 Item 2 of Theorem 7.1 does not hold true for graphs of order $n = 4$: it is not hard to prove that $\mathrm{BS}(F) = 4$ for the graph $F$ on 4 vertices with 1 edge.

Remark 7.3 In [11] we address the first order definability of a random graph $G$ on $n$ vertices. It is proved that, with probability $1 - o(1)$,

$$\log_2 n - 2\log_2\log_2 n \le I(G) \le \log_2 n - \log_2\log_2 n + \log_2\log_2\log_2 n + O(1).$$

One of the ingredients of the proof is that, with high probability, $\delta(G) \ge n - (2 + o(1))\log_2 n$. Since $I(G) \le \mathrm{BS}_2(G) \le n + 2 - \delta(G)$, we conclude that, with high probability,

$$\log_2 n - 2\log_2\log_2 n \le \mathrm{BS}_2(G) \le (2 + o(1))\log_2 n.$$



8 Open problems 



1. Let $I(n, k)$ (resp. $I_1(n, k)$; $\mathrm{BS}(n, k)$) be the maximum $I(M)$ (resp. $I_1(M)$; $\mathrm{BS}(M)$) over structures of order $n$ with maximum relation arity $k$. We now know that

$$\frac{n}{2} \le I(n, k) \le I_1(n, k) \le \mathrm{BS}(n, k) < \Bigl(1 - \frac{1}{2k^2+2}\Bigr) n + k \tag{45}$$

and $I_1(n, k) < \bigl(1 - \frac{1}{2k}\bigr) n + k^2 - k + 2$.

Note that $I(n, k) \le I(n, k + 1)$ and that the lower bound of $n/2$ is actually for $I(n, 1)$. Narrow the gap between the lower and upper bounds in (45).
The case of $k = 2$ is essentially solved in [14], where the bounds

$$\frac{n + 1}{2} \le I(n, 2) \le I_1(n, 2) \le \frac{n + 3}{2}$$

are proved. If $k = 3$, we are able to improve on (45) by showing that $I_1(n, 3) \le \frac{2}{3} n + O(1)$ (in [14] this bound was obtained for 3-uniform hypergraphs).

2. Can one prove a non-trivial upper bound for $I_0(n, k)$? The weakest question is if $I_0(n, k) \le n - 1$. It is easy to show that $I_0(n, 1) \le (n + 1)/2$. In [14] we prove that $I_0(G) \le (n + 5)/2$ for graphs of order $n$.

3. What happens if we restrict the number of existential rather than universal quantifiers in an identifying Bernays-Schönfinkel formula?